https://doi.org/10.5194/egusphere-2025-4333
09 Dec 2025
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

From Reanalysis to Climatology: Deep Learning Reconstruction of Tropical Cyclogenesis in the Western North Pacific

Duc-Trong Le, Tran-Binh Dang, Anh-Duc Hoang Gia, Duc-Hai Nguyen, Minh-Hoa Tien, Xuan-Truong Ngo, Quang-Trung Luu, Quang-Lap Luu, Tai-Hung Nguyen, Thanh T. N. Nguyen, and Chanh Kieu

Abstract. Tropical cyclogenesis (TCG) climatology is key to understanding regional weather extremes and long-term cyclone risk, yet its large-scale environmental drivers remain difficult to characterize from observations or traditional physics-based modelling. In this study, we develop a deep learning (DL) framework, TCG-Net, based on an 18-layer residual convolutional neural network (ResNet-18), to reconstruct TCG climatology in the western North Pacific (WNP) basin from NASA’s Modern-Era Retrospective analysis for Research and Applications Version 2 (MERRA-2). The framework addresses two tasks: the Past Domain (PD) task, which predicts when TCG occurs in the WNP within the next 48 hours, and the Dynamic Domain (DD) task, which predicts the spatial distribution of TCG at a given date and time. For each task, a tailored labeling strategy defines a different set of negative samples to distinguish TCG from non-TCG conditions. To help the model cope with the rarity of TCG events, temporal feature enrichment incorporates environmental information from preceding 6-hour time steps, which improves the representation of each training task. In addition, random under-sampling (RUS) is combined with class weighting to address the severe class imbalance caused by the large number of negative (non-TCG) samples under these labeling strategies. Trained on data from 1980–2016 and evaluated on an independent set from 2017–2022, TCG-Net achieves an overall F1-score of 0.39 for the PD task and 0.33 for the DD task. In the PD task, feature selection experiments reveal that only a subset of environmental variables, including vertical wind shear, low- to mid-level humidity, and mid-level vertical motion, is required for robust performance, consistent with prior physical studies. In contrast, for the DD task, full-feature models perform better, likely due to their ability to exploit unknown or latent feature interactions. Both tasks reproduce key characteristics of the observed seasonality and spatial TCG distribution when evaluated against the best-track dataset. These results demonstrate that DL-based reconstructions, when coupled with task-specific labeling, temporal enrichment, and imbalance-aware training, can complement physics-based simulations and vortex-tracking algorithms and provide an efficient pathway for downscaling or projecting TCG climatology from coarse-resolution climate model outputs.
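The imbalance handling and the evaluation metric named in the abstract can be illustrated with a small, generic sketch. This is not the TCG-Net code: the function names, the negative-to-positive `ratio`, the inverse-frequency weighting formula, and the toy labels are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import random
from collections import Counter


def random_under_sample(labels, ratio=1.0, seed=0):
    """Keep every positive (1) and at most ratio * n_pos randomly chosen negatives (0).

    Returns the sorted indices of the retained samples. `ratio` is a
    hypothetical knob, not a value taken from the paper.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), int(ratio * len(pos)))
    return sorted(pos + rng.sample(neg, n_keep))


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def f1_score(y_true, y_pred):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (assumed): 5 positive samples among 100.
labels = [1 if i % 20 == 0 else 0 for i in range(100)]

# Under-sample negatives to 3x the positives, then weight the remaining classes.
kept = random_under_sample(labels, ratio=3.0)
sub_labels = [labels[i] for i in kept]
weights = balanced_class_weights(sub_labels)
```

Under-sampling shrinks the negative class before training, while the class weights compensate for whatever imbalance remains in the loss. F1 is a natural score here because it ignores true negatives, which dominate a rare-event dataset.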

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 03 Feb 2026)

Latest update: 09 Dec 2025
Short summary

We study how and where tropical storms begin in the western North Pacific. Using many years of global weather data and a modern pattern-recognition method, we built a model that learns signals that come before storm formation and maps when and where formation is likely. It reproduces known seasonal and regional patterns and identifies key environmental cues. These results can support better risk planning and help refine climate projections.
