An interpretable machine learning approach for marine heatwave prediction in the South China Sea
Abstract. A primary challenge in applying machine learning to marine heatwave (MHW) prediction in the South China Sea (SCS) is the limited availability of observational data for model training. To address this issue, this study explores the viability of using multi-member ensemble simulations from the Coupled Model Intercomparison Project Phase 6 (CMIP6) to construct an extensive, physically consistent training dataset for various machine learning models. After training on multiple CMIP6 ensemble members, the models are evaluated for their capacity to predict MHWs in the SCS. The results show that these machine learning-based methods perform comparably to existing dynamical models and in some cases outperform them. Furthermore, machine learning interpretability techniques make it possible to identify the key physical processes underlying these predictions. The new method is therefore not a traditional "black box" but an effective tool that offers a degree of physical transparency and scientific interpretability.