A Global Ensemble Forecast System (GEFS)-based Synthetic Event Set of U.S. Tornado Outbreaks

Malloy, Kelsey; Tippett, Michael K.

doi:10.5194/egusphere-2025-3145

Preprints

22 Jul 2025

Abstract. Severe convective storms (SCS) are important drivers of global insured losses, and tornado outbreaks, in which many tornadoes occur within a short time span, cause extreme and localized loss of life and property. Tornado outbreak risk estimates from observations, either storm reports or reanalysis environments, are limited by the meteorological conditions that have occurred in the historical period. A standard approach to addressing this inadequacy is to construct synthetic event sets that consist of unrealized but plausible events that better represent the full range of possible outcomes. In this study, we constructed and evaluated a synthetic event set of U.S. tornado outbreaks using Global Ensemble Forecast System (GEFS) environments and a tornado outbreak index. With over 800,000 daily maps of environments, over 200,000 synthetic events are generated. In a seamless framework, the synthetic event set includes "daughter events", which are constructed from short-lead forecasts and resemble historical events, as well as independent, physically plausible events constructed from longer-lead forecasts. With the GEFS synthetic event set, we estimated that the 1-in-100-year and 1-in-1000-year U.S. tornado outbreak events have 150–250 and 275–400 (EF/F1+) tornadoes per day, respectively. The GEFS synthetic event set also shows robust shifts related to ENSO (higher outbreak activity during La Niña conditions) and trends (increased outbreak activity during 2010–2019 compared to 2000–2009), consistent with reports. We also developed a subsampling procedure to estimate locally specific tornado outbreak risk, which we illustrate by generating return level curves for grid cells that cover Dallas, Nashville, and Chicago.

Received: 01 Jul 2025 – Discussion started: 22 Jul 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

CC1: 'Comment on egusphere-2025-3145', Adam Scaife, 28 Jul 2025

At line 50 this manuscript discusses stochastic or partial physics approaches to event sets but this omits a whole new set of literature using full physics models to create large event sets using ensembles that are physically plausible. These include:
Extreme heatwave days: https://rmets.onlinelibrary.wiley.com/doi/10.1002/wea.7741
Monsoon rainfall: https://iopscience.iop.org/article/10.1088/1748-9326/ab7b98
and Extreme rainfall: https://www.nature.com/articles/s41612-020-00149-4

Citation: https://doi.org/10.5194/egusphere-2025-3145-CC1
RC1: 'Comment on egusphere-2025-3145', Anonymous Referee #1, 13 Aug 2025
Review of “A Global Ensemble Forecast System (GEFS)-based Synthetic Event Set of U.S. Tornado Outbreaks”
Malloy and Tippett, 2025, NHESS
This article describes a technique to generate synthetic tornado events from a reforecast dataset from the Global Ensemble Forecast System (GEFS). Synthetic events are created from short- and long-lead GEFS reforecasts by using simulated environments to predict the probability of a tornado outbreak occurrence, using previously established formulas from Malloy and Tippett (2024). Precipitation, SRH, and CAPE are used empirically to derive a map of probabilities defining the outbreak index. The probabilities are further used to compute the expected number of outbreak tornadoes given the environment. Notably, these probabilities and indices are computed across all GEFS ensemble members (5-11 members depending on the initialization) to build up a larger dataset of outbreak tornadoes than what is provided in the current observational record. These simulated environments are plausible realizations of tornado occurrences even if they didn’t happen exactly in the past – they could in the future!
From this index the authors construct return intervals for outbreak tornadoes, comparing the GEFS-based intervals to those from the observation record, and further delineate how return intervals can describe seasonal differences, reporting trends (i.e., 2000-2009 vs. 2010-2019), climate teleconnections (ENSO), and local/regional tornado risk. A limited observational record cannot extrapolate beyond approximately 1 in 40 yr events while the simulated GEFS environments provide context for 1-in-100 and 1-in-1000 yr events at daily, yearly, and local levels.
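As a rough illustration of why record length limits the return periods that can be estimated, the sketch below uses a rank-based empirical estimator: in a record of n years, the k-th largest daily count is equalled or exceeded k times, so its empirical return period is n/k years. This is a generic estimator chosen for illustration and is not necessarily the procedure used in the manuscript; the function names and the toy record are hypothetical.

```python
import random


def empirical_return_curve(daily_counts, n_years):
    """Return (return_period_years, level) pairs, rarest event first.

    The k-th largest value is equalled or exceeded k times in n_years,
    so its empirical return period is n_years / k.
    """
    ordered = sorted(daily_counts, reverse=True)
    return [(n_years / rank, level) for rank, level in enumerate(ordered, start=1)]


def return_level(daily_counts, n_years, period_years):
    """Empirical return level for a given return period (in years)."""
    k = int(n_years // period_years)  # number of exceedances expected in the record
    if k < 1:
        raise ValueError("record is shorter than the requested return period")
    ordered = sorted(daily_counts, reverse=True)
    if k > len(ordered):
        raise ValueError("not enough samples for the requested return period")
    return ordered[k - 1]


# Toy data only (hypothetical): 40 "years" of random daily counts.
rng = random.Random(0)
toy_record = [rng.randint(0, 50) for _ in range(40 * 365)]

print(return_level(toy_record, n_years=40, period_years=10))
try:
    return_level(toy_record, n_years=40, period_years=100)
except ValueError as err:
    # A 40-year record cannot support a 1-in-100-year empirical estimate.
    print("cannot estimate:", err)
```

With a much larger synthetic sample, the same estimator can reach longer return periods, which is the motivation for a synthetic event set in the first place.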
I found the paper to be extremely easy to follow and carefully constructed; small technical edits regarding the use of articles are provided below. I think this type of work will have broad implications across the insurance/reinsurance industry, as the authors suggest, but also in the severe storms community as we grapple with how to improve our observational records (e.g., using radar observations to supplement the tornado record, hail record, etc.).
My biggest comment on the manuscript is related to the physical constraints on the system that either allow, or don’t allow as I suspect, the realization of 1-in-100 or 1-in-1000 yr outbreak tornado events. The authors propose that a 1-in-1000 yr event in Nashville would amount to 17 tornadoes in a single day within a 1 degree by 1 degree box. If those tornadoes are equally spaced, they are separated by 33 km in any E-W or N-S direction. That is an incredible density of distinct tornadoes/parent storms that doesn’t seem physically realistic, in part because of what we know about how severe storms interact with one another on storm scales and how environments are impacted by storms. Shouldn’t there be some physical limits to the number of tornadoes on any given day in an area? Extreme rainfall, by comparison, does not have the same spatial constraints as a tornado: tornadoes only happen in one area of a storm, whereas extreme rainfall is possible throughout (with preferential regions). So we can have side-by-side grid points/pixels with rainfall exceeding some return interval threshold but cannot assume that with tornadoes. So don’t we have a physical limit to the number of realized tornadoes for a given time period? I don’t think the authors can address this per se, but I do think they should mention in their concluding thoughts that this type of statistical technique may need to be adjusted to accommodate the physical constraints imposed on tornado frequencies.

Comments, Questions, and Suggestions
Line 107: Are the 6-hourly periods the standard 00-06, 06-12 UTC, etc.? Or arbitrary?

Line 120: I assume you mean you interpolate GEFS data to the 1x1 degree boxes? Native, raw grid spacing for some variables is 0.25deg x 0.25deg.

Line 127: Do you also aggregate all reports over the same daily, 1200-1200 UTC period? This isn’t clear from previous paragraphs, which specify a 6-hourly aggregation, not 24 hours.

Lines 135-146: I think this section could use some rewording and/or wordsmithing. The reference to the ‘second part’ of the index alongside equation two, which I originally thought was the second part of the index, is a bit confusing. I recommend adding that mu is computed across all daily maps to derive the PDF (as I understand it), which then allows you to randomly sample that PDF. The phrasing that you can generate random realizations of tornado occurrence based on “the same daily map” implies a random sample of the derived binomial PDF, but that is not explicit and can be confusing since mu doesn’t vary as a function of a single map (equation 2).

Line 145: When you say “populate random locations”, don’t you mean populate based on weighted locations? It’s not purely random if you have a probability field to weight where the sample of outbreak tornadoes (i.e., a pure count) should be placed.
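The distinction raised here, between purely random and probability-weighted placement, can be sketched as follows. This is a minimal illustration under stated assumptions: the grid-cell labels, the toy probability values, and the function `place_tornadoes` are hypothetical and are not taken from the manuscript.

```python
import random


def place_tornadoes(n_tornadoes, cell_probs, rng=None, weighted=True):
    """Draw a grid cell for each of n_tornadoes tornadoes.

    cell_probs maps a grid-cell label to a non-negative weight; it is a
    hypothetical stand-in for an outbreak-index probability map.
    With weighted=False every cell is equally likely (purely random).
    """
    if rng is None:
        rng = random.Random()
    cells = list(cell_probs)
    weights = [cell_probs[c] for c in cells] if weighted else None
    return rng.choices(cells, weights=weights, k=n_tornadoes)


# Toy probability field (hypothetical values).
toy_probs = {"cell_A": 0.70, "cell_B": 0.25, "cell_C": 0.05}
rng = random.Random(42)

print(place_tornadoes(10, toy_probs, rng))                  # weighted by the field
print(place_tornadoes(10, toy_probs, rng, weighted=False))  # uniform over cells
```

In the weighted case the count is fixed in advance and only the locations are drawn, so cells with higher index values receive more tornadoes on average.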

Lines 178-193: “total U.S. tornadoes” appears to be a separate designation from “total U.S. outbreak tornadoes” – is this purposeful by the authors? I was under the assumption that mu is the expected number of outbreak tornadoes, but in Line 178 mu is referenced as the total U.S. tornadoes. Based on my understanding, all references to “total U.S. tornadoes” are really “total U.S. outbreak tornadoes” and if this is true, the text should be revised for clarity.

Figure 2: The dots displaying observed reports are rather small and get obscured by the contour lines surrounding them. Moreover, the colors denoting number of reports can only be seen when zooming into the manuscript .pdf at extreme percentages. The depiction of storm reports should be reimagined to better convey these observations. One recommendation is to contour only (i.e., no dots) to remove one overlapping piece of information. Alternatively, the color shading of number of reports could be used alone, although I would recommend a different color scheme so even single reports are visible (i.e., not near white shading).

One other recommendation for Figure 2 is to add the total number of observed reports in the top left along with the mu parameter. It would be a good piece of information to include for comparing expected report numbers to true observations on the figure itself without having to refer back to the text (line 223).

Line 225: The authors suggest that 1-day and 6-day forecasts are relatively skillful in predicting tornado outbreaks, but in this example case (Figure 2) it appears that none of the observation locations verify as tornado outbreaks (i.e., > 6 tornadoes, Line 103); the southwest Arkansas point perhaps meets this criterion, but that is not discernible from the data provided in the figure. So how do the authors arrive at the conclusion that these are skillful forecasts of tornado outbreak potential? I see a generally skillful forecast of tornado location, regardless of count, but I don’t believe that is what this outbreak parameter is truly identifying.

Figure 3: Recommend changing the color scheme so the smallest identified tornado per season value can still be visually seen – the light yellow blends into the white background so it is indistinguishable.

Figure 4: Can you add vertical dashed lines at the return period thresholds (nominally just 1, 10, 100, and 1000) so it’s easier to see the return period thresholds for the report/GEFS return curves, like you do in Figure 7? This could help when interpreting the graphics. (Same recommendation for Figure 5)

Technical Edits
Line 143: “total U.S. outbreak tornadoes”
Line 160: “ensemble member j index value at grid cell”
Line 162: “Frequencies” changed to “tornado frequency” – mismatch between singular “an” and plural “frequencies”
Line 175: “of the original”
Line 215: “the 14-day forecast lead”
Line 229: “from the observed event”
Line 231: “11+ day forecasts”
Line 241: “the sporadic, rare nature”
Line 246: “might explain increased mean” – this phrase needs an article (“the” increased mean, “an” increased mean)
Lines 383-385: See Das and Allen (2024) for hail return interval estimation: https://www.nature.com/articles/s44304-024-00052-5
Citation: https://doi.org/10.5194/egusphere-2025-3145-RC1
RC2: 'Comment on egusphere-2025-3145', Anonymous Referee #2, 17 Oct 2025
General comments
In this paper, the authors provide and analyze a set of tornado outbreaks synthetically generated from the GEFS. This allows for the estimation of rare events, such as 1-in-100-year and even 1-in-1000-year tornado outbreaks, enabling the assessment of the most extreme events. Moreover, the influence of the ENSO teleconnection on tornado outbreak activity is investigated.
This work is highly valuable and well-structured. The main objectives are clear, the methodology is robust, and the results are clearly presented and thoroughly discussed. The conclusions are consistent with the findings presented in the paper. Only some clarifications are needed.
Therefore, I recommend accepting this manuscript after the authors address a few minor revisions.
Specific comments
Line 65: For which period have these “upward trends in tornado (outbreak) activity” been detected? Please specify.

Line 103: Are you only taking into account CONUS tornadoes in the outbreak definition, or the entire USA? Also, are you working with “tornado outbreak days”? (E.g., if > 6 tornadoes occur one day and > 6 the day after, with no more than 6 hours between consecutive tornadoes, do you consider this one or two tornado outbreaks?) This should be clarified.
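The ambiguity in this question can be made concrete with a small sketch that counts outbreaks in two ways: by grouping tornadoes whose consecutive start times are no more than 6 hours apart, and by counting per day. The thresholds (more than 6 tornadoes, 6-hour gap) are taken from the referee comments; the function names, the choice of hours as the time unit, and the toy times are assumptions made for illustration, and neither function is claimed to be the manuscript's definition.

```python
def gap_clusters(times_h, max_gap_h=6.0):
    """Group tornado times (in hours) so that consecutive times within a
    group are no more than max_gap_h apart."""
    clusters = []
    for t in sorted(times_h):
        if clusters and t - clusters[-1][-1] <= max_gap_h:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


def count_outbreaks_by_gap(times_h, min_count=7, max_gap_h=6.0):
    """Count gap-based clusters with at least min_count tornadoes (> 6)."""
    return sum(len(c) >= min_count for c in gap_clusters(times_h, max_gap_h))


def count_outbreaks_by_day(times_h, min_count=7, day_h=24.0):
    """Count days with at least min_count tornadoes; hours are measured
    from the start of the first day (e.g. 1200 UTC)."""
    per_day = {}
    for t in times_h:
        day = int(t // day_h)
        per_day[day] = per_day.get(day, 0) + 1
    return sum(n >= min_count for n in per_day.values())


# Toy example (hypothetical times): 7 tornadoes late on day 0 and 7 early
# on day 1, with a 4.5-hour gap between the two groups.
toy_times = [18, 19, 20, 21, 22, 23, 23.5, 28, 29, 30, 31, 32, 33, 34]
print(count_outbreaks_by_gap(toy_times), count_outbreaks_by_day(toy_times))
```

For this toy sequence the two conventions disagree, which is why the referee asks for the definition to be stated explicitly.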

Lines 107 and 114: Why the 6-hourly resampling and 1-degree resolution? It would be helpful to justify these choices in the text.

Lines 27 to 29: In Brooks (2004), tornado path lengths and widths are compared with their F-scale ratings, not EF-scale ratings. Please replace EF with F. Moreover, take into account that, as explained in that paper, “The mean width was reported prior to and including 1994 and the maximum width after 1994”.

Lines 112-115: The use of NARR data is not clearly explained in the Data section. Please consider adding a sentence here clarifying the specific purpose for which it is being used.

Line 120: Is the GEFS data aggregated in the same way as NARR (the sum for CP and the 6-h average for SRH and CAPE)? What are the original temporal and spatial resolutions of GEFS? This should be stated in the main text.

Technical corrections
Line 11: In the main text you refer to tornado intensity as F/EF. Please replace EF/F1+ with F/EF1+ for consistency.

Line 25: It would be good to provide references for the F and EF scales, for example Fujita (1971) for the F-scale and WSEC (2006) for the EF-scale.

Fujita, T. T. (1971): Proposed characterization of tornadoes and hurricanes by area and intensity. SMRP Research Paper, 91: 48.
WSEC (2006): A Recommendation for an Enhanced Fujita Scale (EF-scale). http://www.spc.noaa.gov/faq/tornado/EFScale.pdf
Figure caption 3: (m) to (p) maps are GEFS extended (day 10-34 forecasts) index, not short-lead (day 1-9 forecasts) index.

Line 321: Fig. 9c-e does not exist (it is Fig. 9a-c)

Line 322: There is an extra space between “further into the extremes” and the period.
Citation: https://doi.org/10.5194/egusphere-2025-3145-RC2

Viewed

Total article views: 1,512 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,440	52	20	1,512	29	28

Views and downloads (calculated since 22 Jul 2025)

Month	HTML	PDF	XML	Total
Jul 2025	116	19	5	140
Aug 2025	286	14	6	306
Sep 2025	957	5	5	967
Oct 2025	53	8	3	64
Nov 2025	28	6	1	35

Latest update: 19 Nov 2025

Short summary

Tornado outbreaks (many tornadoes in short succession) have major impacts, but it is hard to accurately assess their risk because they are rare. We used weather model data to create hundreds of thousands of realistic but unseen tornado outbreak scenarios. With this event set, we estimated U.S. and local outbreak risk and detected clear links to La Niña and an upward trend in outbreak activity in recent years.

