Global Sub-national Impact-based Forecasting for Tropical Cyclones Using Open Data: Combining Machine Learning and Exposure-based Approaches
Abstract. Tropical cyclones (TCs) cause substantial and uneven impacts across regions, driven by differences in exposure and vulnerability. Anticipatory action (AA) systems aim to mitigate these impacts, but they are typically triggered by hazard thresholds rather than predicted consequences, which limits their effectiveness and consistency. Impact-based forecasting offers a promising alternative, but existing approaches are often region-specific or rely on non-transferable data. In this study, we develop a global, sub-national impact-based forecasting framework that predicts affected-population fractions using only openly available data. The model integrates hazard, exposure, and contextual features within a two-stage XGBoost architecture and is evaluated on 780 historical TC events using decision-relevant metrics aligned with operational thresholds. Our results show that machine learning improves the detection and spatial localization of impacts but does not outperform simpler exposure-based approaches in identifying severe events. This result reveals a fundamental trade-off between coverage and conservative severity detection, suggesting that hybrid strategies combining both approaches are better suited to operational use. We position this system as a first-generation global benchmark for impact-based forecasting: it demonstrates the feasibility of transferable, sub-national predictions using open data, while clarifying the limitations that must be addressed for reliable deployment in anticipatory action systems.
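The abstract does not specify what each of the two stages does. One common reading of a two-stage design for zero-heavy impact data is a hurdle structure, in which a first model estimates whether any impact occurs and a second model estimates the affected-population fraction where it does. The following minimal sketch illustrates only that combination logic, under this assumption. The function name, the parameter names, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]


def two_stage_predict(
    rows: Sequence[Features],
    occurrence_prob: Callable[[Features], float],
    affected_fraction: Callable[[Features], float],
    occurrence_threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> List[float]:
    """Combine a stage-1 occurrence model with a stage-2 severity model.

    Assumed hurdle logic: if stage 1 judges impact unlikely, predict zero;
    otherwise use the stage-2 estimate, clipped to the valid range [0, 1].
    """
    predictions: List[float] = []
    for row in rows:
        if occurrence_prob(row) < occurrence_threshold:
            # Stage 1 says "no impact": the affected fraction is set to zero.
            predictions.append(0.0)
        else:
            # Stage 2 estimates the fraction; clip because it is a proportion.
            fraction = affected_fraction(row)
            predictions.append(min(1.0, max(0.0, fraction)))
    return predictions


if __name__ == "__main__":
    # Stand-in models for illustration only: feature 0 plays the role of the
    # occurrence probability and feature 1 the raw severity estimate.
    demo_rows = [[0.1, 0.2], [0.9, 0.5], [0.8, 2.0]]
    print(two_stage_predict(demo_rows, lambda r: r[0], lambda r: r[1]))
```

In practice the two callables could wrap fitted gradient-boosted models, for example the `predict_proba` method of an `xgboost.XGBClassifier` and the `predict` method of an `xgboost.XGBRegressor`. Whether the study's stages are arranged this way is an assumption.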