Using Data Assimilation to Improve Data-Driven Models
Abstract. Data-driven models (DDMs) are developed by analysing extensive datasets to detect patterns and make predictions, without relying on predefined rules or instructions from humans. In fields like numerical weather prediction (NWP), DDMs are gaining popularity as potential replacements for traditional numerical models, thanks to their grounding in a multi-decadal, high-quality data assimilation (DA) analysis product. Recent studies, such as Lang et al. (2024), have demonstrated that DDMs trained on ERA5 (the fifth-generation European Centre for Medium-Range Weather Forecasts atmospheric reanalysis) can outperform traditional numerical models. DA integrates observations from various sources with numerical models to improve the accuracy of model state estimates and of predictions or simulations of a system's behaviour. Given the benefits of both DDMs and DA, their integration has been gaining traction in a wide range of fields.
This paper focuses on applying DA methodologies to enhance the precision and efficiency of DDM generation. The aim is to demonstrate the pivotal role that DA can play in refining and optimising DDM generation by incorporating various observation data directly, which improves the accuracy and reliability of predictive models despite the presence of observational uncertainties. This study shows how DDMs can improve on imperfect model forecasts and how, in conjunction with DA, they can cyclically generate more accurate training data, further enhancing the precision of DDMs.
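As a rough illustration of how DA combines a model forecast with an uncertain observation, the sketch below applies a scalar Kalman-style analysis update. It is not the method used in this paper; the function name `analysis_update`, its arguments, and the numerical values are hypothetical and chosen only for illustration, and both error variances are assumed known.

```python
def analysis_update(x_b, var_b, y, var_o):
    """Blend a scalar forecast (background) with one direct observation.

    x_b, var_b: background state and its error variance (assumed known)
    y, var_o:   observation and its error variance (assumed known)
    """
    # Weight given to the observation: large when the background is uncertain.
    gain = var_b / (var_b + var_o)
    # Analysis = background corrected by the weighted innovation (y - x_b).
    x_a = x_b + gain * (y - x_b)
    # The analysis error variance never exceeds the background variance.
    var_a = (1.0 - gain) * var_b
    return x_a, var_a


# Hypothetical values: an uncertain forecast and a more precise observation.
x_a, var_a = analysis_update(x_b=10.0, var_b=4.0, y=12.0, var_o=1.0)
print(x_a, var_a)
```

In this toy setting the analysis lies between the forecast and the observation, closer to whichever has the smaller error variance, and its variance is smaller than either input variance. In a cyclic setup such as the one described above, analyses of this kind could serve as training targets for a DDM, whose forecasts then provide the background for the next update.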
Competing interests: At least one of the (co-)authors is a member of the editorial board of Nonlinear Processes in Geophysics.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.