Investigating the drivers of wintertime Southern Ocean sea-ice leads using random forest algorithms
Abstract. Sea-ice leads play a crucial role in regulating ocean-atmosphere energy exchange, yet the physical drivers controlling their variability across the Southern Ocean remain unquantified. This study uses random forest regression with permutation importance analysis to identify the main predictors of Southern Ocean lead frequency in winter (April–September, 2003–2023). The model integrates nine predictors: four atmospheric variables (wind speed, wind divergence, sea-level pressure, 2 m temperature), one oceanic variable (surface current speed), three sea-ice kinematic variables (ice velocity, ice divergence, ice concentration), and a seasonal descriptor (month). Evaluated on independent test data, the model achieves a correlation of r = 0.70 at the pan-Antarctic scale and r = 0.68–0.78 across regional sectors (MAE = 0.016–0.024). Permutation analysis indicates that 2 m temperature (20 %), wind divergence (13 %), ice divergence (11.9 %), and ocean current speed (11.6 %) collectively explain approximately 57 % of the observed lead frequency variability. Regional analysis reveals sector-specific drivers: the Weddell Sea is controlled by wind and ice divergence; the Ross Sea exhibits contributions from air temperature, wind divergence, and ocean currents; the Indian and Pacific Ocean sectors show strong air temperature and ocean current influence; and the Bellingshausen–Amundsen Seas are dominated by seasonal wind forcing. However, the model does not fully resolve fine-scale structures evident in observations, and a notable portion of the lead frequency variance remains unexplained owing to the spatial resolution used in this study. Future work should therefore apply the random forest framework at higher spatial resolution to investigate small-scale regional lead hotspots, including bathymetrically controlled and coastal lead zones.
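The workflow summarised in the abstract (fit a random forest regressor, evaluate it on held-out data with r and MAE, then rank predictors by permutation importance) can be illustrated with a minimal sketch. This is not the study's code: the data are synthetic, the feature names and the relationship between predictors and target are invented for illustration, and scikit-learn is assumed as the implementation. Rescaling the permutation importances to percentage shares is likewise an assumption about how per-predictor percentages might be derived.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical predictor names, loosely following the abstract's list.
FEATURES = ["t2m", "wind_div", "ice_div", "current_speed", "month"]


def synthetic_data(n=2000, seed=0):
    """Generate invented data: t2m matters most, month not at all."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    y = (0.6 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2] + 0.1 * X[:, 3]
         + rng.normal(scale=0.2, size=n))
    return X, y


def importance_shares(seed=0):
    """Fit a random forest and return importance shares (%), r, and MAE."""
    X, y = synthetic_data(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)

    # Evaluate on the held-out split only.
    pred = model.predict(X_test)
    r = float(np.corrcoef(y_test, pred)[0, 1])
    mae = float(np.mean(np.abs(y_test - pred)))

    # Permutation importance: mean drop in the test score (R^2 by default)
    # when one predictor column is shuffled.
    result = permutation_importance(
        model, X_test, y_test, n_repeats=5, random_state=seed
    )
    # Assumption: clip small negative values, then rescale to sum to 100 %.
    raw = np.clip(result.importances_mean, 0.0, None)
    shares = 100.0 * raw / raw.sum()
    return {name: float(s) for name, s in zip(FEATURES, shares)}, r, mae


shares, r, mae = importance_shares()
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} {share:5.1f} %")
print(f"r = {r:.2f}, MAE = {mae:.3f}")
```

In this toy setup the predictor given the largest synthetic weight should rank first, which is the behaviour the ranking is meant to capture. Note that shares of summed permutation importance are relative rankings and are not identical to fractions of explained variance.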