the Creative Commons Attribution 4.0 License.
SPIN (v1.0): A Spontaneous Synthetic Tropical Cyclone Model Empowered by NeuralGCM for Hazard Assessment
Abstract. A hybrid framework for simulating SPontaneous synthetic tropical cyclones (TCs) with realistic INtensity, hereafter SPIN, is developed for TC risk assessment. A key advantage of SPIN over previous synthetic TC models is that it avoids the assumption of independence between TCs, while enabling two-way interactions between synthetic TCs and their ambient environment. The SPIN model leverages a Neural General Circulation Model (NeuralGCM) to simulate spontaneously generated TC tracks, and then couples a dynamic TC intensity model to estimate their intensity evolutions based on the large-scale environment. SPIN reproduces the observed climatology of TC activity, including interannual variability, seasonal cycle, genesis, tracks, and lifetime maximum intensity distributions. It also faithfully reproduces the observed return periods of landfall intensity across different regions, enabling its future application to TC risk assessment. Beyond individual TC events, SPIN demonstrates improved skills in representing multiple tropical cyclone events (MTCEs), including their interannual variability, peak concurrent TC count per cluster, and the spatial relationship between consecutive TCs. By circumventing the independent TC assumption and allowing for two-way TC-environment interactions, SPIN opens new potential for assessing compound hazards like MTCE and beyond.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-5540', Anonymous Referee #1, 31 Mar 2026
- RC2: 'Comment on egusphere-2025-5540', Anonymous Referee #2, 31 May 2026
Recommendation: Major Revisions
Summary
This paper introduces SPIN (v1.0), a hybrid framework for synthetic tropical cyclone (TC) simulation that combines the NeuralGCM atmospheric model for spontaneous TC track generation with the FAST intensity model to produce realistic maximum sustained winds. The main claimed advantages over existing statistical downscaling models are: (1) spontaneous TC genesis without manual seeding, (2) two-way interactions between TCs and their environment, and (3) improved representation of multiple tropical cyclone events (MTCEs). The model is evaluated against IBTrACS observations and the JL23 statistical downscaling benchmark across a broad range of TC metrics, including interannual variability, seasonal cycle, genesis and track density, lifetime maximum intensity (LMI), landfall frequency, and MTCE characteristics.
SPIN fills a real gap in the TC hazard modeling literature by providing a computationally efficient framework that avoids the independence assumption of statistical models. The code and data are made openly available on Zenodo. However, several central claims require significant improvements before the paper is suitable for publication.
General Comments
The term "downscaling" is used throughout the introduction without explicit definition, which may create ambiguity for readers. In mainstream climatology, "downscaling" typically refers to the spatial refinement of gridded meteorological fields (e.g., temperature, precipitation, wind) from coarse to fine resolution, either through dynamical methods (regional climate models) or statistical approaches (bias correction, regression-based methods). In the tropical cyclone hazard community, however, "downscaling" — or more precisely "statistical-dynamical downscaling" — carries a different and more specific meaning: the generation of synthetic TC tracks and intensities from large-scale environmental fields, without necessarily producing spatially explicit high-resolution wind fields. These two usages are sufficiently different that conflating them risks misleading readers outside the TC hazard community. More broadly, it would be helpful for the authors to better acknowledge the diversity of approaches that exist for generating synthetic TC tracks.
NeuralGCM's performance in representing TCs is not fully demonstrated (as has been done for ECMWF models, for instance). Several important questions that are directly relevant to the reliability of the SPIN framework are left unaddressed: How long is the model rollout stable? Are there any known biases? Are all simulated synthetic tracks physically plausible, or can NeuralGCM produce spurious vortex-like features (hallucinations) that may be incorrectly identified as TCs by TempestExtremes? What do extreme TCs look like in NeuralGCM outputs (10 m wind speed, sea level pressure structure)? What initial conditions are used, and how are observed SSTs incorporated? These aspects are critical for assessing the physical realism of the TC catalog.
It is not clear to me why SPIN allows "two-way interactions" between synthetic TCs and their environment. FAST computes TC intensity independently along each NeuralGCM-derived track, without any feedback into the atmospheric fields. In addition, SPIN's improved MTCE representation is attributed to these interactions, a claim that needs more evidence and justification.
Specific Comments
- Section 2.2.1
Could the authors elaborate on how the NeuralGCM ensemble simulations were conducted? Why were 14 members chosen specifically? The statement that "the 1.4° version produces reliable TC activity" is vague for a claim that underpins the entire study. Did the authors perform substantial checks to verify that NeuralGCM reliably simulates TCs across the full 1980-2022 period? What is the spread in track representation between ensemble members for an identical simulation year?
- Section 2.2.2
The authors state that TempestExtremes parameters are fine-tuned separately for each basin to align simulated annual average track counts and storm lifetimes with IBTrACS. Could the authors elaborate on this procedure? Please also specify the total number of simulated years and clarify the link with Figure 14.
- Section 3.1
The authors note that interannual correlations could be improved by initializing NeuralGCM in April rather than mid-October, yet this configuration was not adopted. The rationale for choosing October initialization should be explicitly justified.
- Section 3.1
The proposed explanation for SPIN's improved interannual skill in mixed-ENSO-signal regions is physically plausible but remains qualitative and unverified. The authors should support this claim more explicitly, for instance by showing the spatial distribution of ENSO influence on TC genesis, and by demonstrating that SPIN's improvement over JL23 is indeed concentrated in regions with mixed-sign ENSO responses.
- Section 3.1
The tendency of SPIN to overestimate TC genesis in the early-to-peak season and underestimate it toward the end of the season is acknowledged but not investigated. Given that this bias is attributed to NeuralGCM, the authors should provide a more thorough discussion of its likely physical origin — whether related to biases in large-scale circulation timing, smoothed monthly SST forcing, or misrepresentation of TC precursors.
- Section 3.2
The paragraph describing the track density results (lines 234-238) is repeated at lines 244-249.
- Section 3.3
The tuning procedure for Ck is not described. The authors should specify which dataset was used and how Ck was calibrated, to ensure the subsequent intensity evaluation in Sections 3.3 and 4 is not a possible artifact of parameter fitting.
- Section 3.5
The authors do not discuss the physical mechanisms behind SPIN's improved representation of inter-genesis times.
- Section 3.6
Line 360: "Figure 3" in place of "Figure 11."
- Section 3.6
The authors attribute SPIN's improved MTCE representation to two-way TC-environment interactions. Would a "SPIN-uniform" experiment, in which NeuralGCM tracks are retained but genesis dates are resampled uniformly within each month, help isolate the contribution of physical TC-environment interactions from that of temporal resolution alone? This aspect is quite important, as it is presented as one of the major improvements of the SPIN model over existing TC track models.
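The "SPIN-uniform" control suggested above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: each storm is reduced to a hypothetical (genesis time, lifetime) pair, genesis is redrawn at hourly resolution within its own calendar month, and the peak number of concurrent storms stands in for whatever MTCE metric the manuscript actually uses. The function names `resample_genesis_uniform` and `peak_concurrent` are invented for this example.

```python
import calendar
import random
from datetime import datetime, timedelta


def resample_genesis_uniform(storms, rng):
    """Redraw each genesis time uniformly within its calendar month.

    storms: list of (genesis: datetime, lifetime: timedelta) pairs
            (a hypothetical simplification of a full track record).
    rng:    a random.Random instance, so the experiment is reproducible.
    Lifetimes are kept unchanged; only the timing within the month moves.
    """
    resampled = []
    for genesis, lifetime in storms:
        days_in_month = calendar.monthrange(genesis.year, genesis.month)[1]
        month_start = datetime(genesis.year, genesis.month, 1)
        # Integer hours keep the new genesis strictly inside the month.
        offset = timedelta(hours=rng.randrange(days_in_month * 24))
        resampled.append((month_start + offset, lifetime))
    return resampled


def peak_concurrent(storms):
    """Largest number of storms alive at the same time (sweep line)."""
    events = []
    for genesis, lifetime in storms:
        events.append((genesis, 1))              # storm starts
        events.append((genesis + lifetime, -1))  # storm ends
    # At equal times, process ends (-1) before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    alive = peak = 0
    for _, delta in events:
        alive += delta
        peak = max(peak, alive)
    return peak
```

Comparing the distribution of `peak_concurrent` over many resampled catalogs with the value from the original catalog would indicate how much of the clustering survives once sub-monthly timing is randomized.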
Citation: https://doi.org/10.5194/egusphere-2025-5540-RC2
This manuscript describes a new approach for generating synthetic tropical cyclone (TC) event sets for risk assessment. The approach, named SPontaneous synthetic TC with realistic INtensity (SPIN), leverages the newly developed AI/ML model NeuralGCM and the existing statistical-dynamical TC intensity model FAST. The authors show that SPIN has an advantage over conventional statistical-dynamical downscaling models by enabling two-way interactions (TCs are not just forced by their environments but also feed back to them) and claim that the model has improved skill in predicting multiple TC events (MTCs). The idea of combining an AI/ML weather model like NeuralGCM with FAST is novel; thus, I think the manuscript should be published. However, I have some minor questions about some of the details, especially the discussion of MTCs. Below is a list of my comments and questions.
(1) First, the authors argue that SPIN enables two-way interaction between TCs and their environment, which in my opinion is only partially true. Storm intensity in SPIN is post-processed using FAST, so it does not really feed back to NeuralGCM's environment. As a result, MTCs in SPIN do not really reflect true storm-to-storm interaction. Please add a couple of sentences on this limitation.
(2) Could you elaborate on how the simulations were conducted? Are the 14 ensemble members’ simulations initialized 6 hours apart from each other? Can these simulations be considered SST-forced runs, meaning that after the initial time, the only input from ERA5 is the monthly SST and SIC? There also appears to be a 2.5-month spin-up period. Is that correct, and is it necessary?
(3) I am not really following the argument here — it basically says that NeuralGCM better captures the ENSO modulation of TCs, but both the JL models and other models (Lin et al. 2024; Lee et al. 2025) show that they can simulate ENSO modulation of TCs as well. A clearer demonstration would be a direct comparison of interannual TC frequency or intensity anomalies conditioned on ENSO phase across models.
Lee, C., S. J. Camargo, C. Francis, C. Karamperidou, and C. M. Patricola-DiRosario, 2025: Climate Change Impact on the ENSO–TC Relationship in CMIP6: Synthetic TC Analysis. J. Climate, 38, 5595–5614, https://doi.org/10.1175/JCLI-D-24-0662.1.
Lin, J., C.-Y. Lee, S. Camargo, et al., 2024: The Response of Tropical Cyclone Hazard to Natural and Forced Warming Patterns. Research Square, preprint (Version 1), 21 October 2024, https://doi.org/10.21203/rs.3.rs-5248169/v1.
(4) Also, regarding Line 190, one assumption of the random-seeding approach in the JL model is that the genesis process is simply part of intensification. However, your argument seems to suggest that this assumption may not hold for interannual variability. Could you further discuss this, and whether approaches that use genesis indices would be a better way to handle interannual variability?
(5) Figure 8. The area definition is not precise. It is not just the eastern US — you also include the Gulf of Mexico, which includes the southern US.
(6) How did you handle extratropical transition (ET) storms? If you simply run FAST all the way to the mid-latitudes, you are likely to overestimate storm intensity and introduce a positive bias in the number of MTCs.
(7) Line 300. You may need to check JL23 for details; I think most existing statistical-dynamical downscaling models can provide date information. It may stop at monthly resolution because the input is monthly data, and thus 'daily' information is simply artificially generated by the seeding rate. However, they do have date information, and MTCs will exist when the monthly environmental conditions are more favorable than in other months. So, in a way, is this not similar to your approach of using monthly SST input?
(8) L330. In SPIN, when you apply NeuralGCM output to FAST, do you use instantaneous output or monthly averaged fields? Also, do you have any idea why JL23-ERA5 performs better than JC23-NeuralGCM?
(9) Could you show the sampling errors of your MTCE analysis, like those in Figure 10?
(10) Figure 11: Do STCs forming in these quadrants have a geographic or seasonal preference?
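Regarding comment (9), one common way to attach sampling errors to an MTCE statistic is a percentile bootstrap over simulated years. The sketch below is only an illustration under stated assumptions: it takes a hypothetical list of annual MTCE counts, resamples years with replacement, and reports the mean with an approximate 95 % interval. The function name `bootstrap_ci` and its parameters are invented here, and nothing in it reflects how Figure 10 of the manuscript was actually produced.

```python
import random
import statistics


def bootstrap_ci(annual_counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the mean annual count.

    annual_counts: one value per simulated (or observed) year; hypothetical input.
    Returns (sample_mean, lower_bound, upper_bound).
    """
    if not annual_counts:
        raise ValueError("annual_counts must not be empty")
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(annual_counts)
    # Resample whole years with replacement and record each resample's mean.
    boot_means = sorted(
        statistics.fmean(rng.choices(annual_counts, k=n)) for _ in range(n_boot)
    )
    # Approximate percentile positions in the sorted bootstrap means.
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(annual_counts), lower, upper
```

Resampling whole years rather than individual storms keeps within-season clustering intact, which matters for a statistic that is defined by clustering in the first place.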