https://doi.org/10.5194/egusphere-2025-933
12 Mar 2025
Status: this preprint is open for discussion and under review for Nonlinear Processes in Geophysics (NPG).

Using Data Assimilation to Improve Data-Driven Models

Michael Goodliff and Takemasa Miyoshi

Abstract. Data-driven models (DDMs) are developed by analysing extensive datasets to detect patterns and make predictions, without relying on predefined rules or instructions from humans. In fields like numerical weather prediction (NWP), DDMs are gaining popularity as potential replacements for traditional numerical models, thanks to their grounding in a multi-decadal, high-quality data assimilation (DA) analysis product. Recent studies, such as Lang et al. (2024), have demonstrated that DDMs trained on ERA5 (the fifth-generation European Centre for Medium-Range Weather Forecasts atmospheric reanalysis) can outperform traditional numerical models. DA integrates observations from various sources with numerical models to improve the accuracy of model state estimates and of predictions or simulations of a system's behaviour. Given the benefits of both DDMs and DA, the integration of these methods has been gaining traction in a wide range of fields.
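
As a minimal illustration of how DA blends observations with a model estimate, the sketch below applies the standard scalar Kalman analysis update, weighting a background (model) value and an observation by their error variances. It is not taken from the paper; the function name `analysis_update` and all numbers are assumptions chosen for illustration.

```python
def analysis_update(x_b, var_b, y, var_o):
    """Scalar Kalman analysis step (illustrative, not the paper's scheme).

    x_b, var_b: background (model) estimate and its error variance
    y, var_o:   observation and its error variance
    """
    # The gain is the fraction of the innovation (y - x_b) that is trusted.
    gain = var_b / (var_b + var_o)
    x_a = x_b + gain * (y - x_b)      # analysis state
    var_a = (1.0 - gain) * var_b      # analysis error variance
    return x_a, var_a


# Hypothetical numbers: an uncertain background and a more precise observation.
x_a, var_a = analysis_update(x_b=10.0, var_b=4.0, y=12.0, var_o=1.0)
print(x_a, var_a)  # x_a is about 11.6, var_a about 0.8
```

Because the observation is assumed to be four times more certain than the background, the analysis lands closer to the observation, and its variance is smaller than that of either input.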

This paper focuses on the application of DA methodologies to enhance the precision and efficiency of DDM generation. The aim is to demonstrate the pivotal role that DA can play in refining and optimising DDM generation by incorporating various observation data directly, improving the accuracy and reliability of predictive models despite observational uncertainties. This study shows how DDMs can improve on imperfect model forecasts and, in conjunction with DA, can cyclically generate more accurate training data, further enhancing the precision of DDMs.
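
The cyclic idea described in this paragraph can be sketched on a toy problem. This is an illustrative assumption, not the authors' algorithm: the "truth" is a forced scalar linear system, the imperfect model uses the wrong coefficient, DA is a fixed-gain scalar update, and the DDM is a one-parameter least-squares fit standing in for a learned model. The names `run_da` and `fit_ddm` and all parameter values are hypothetical.

```python
import numpy as np


def run_da(a_model, forcing, obs, gain=0.5):
    """Cycle forecast and analysis steps with a fixed-gain scalar update."""
    x_a = np.empty(len(obs))
    x_a[0] = obs[0]
    for k in range(len(obs) - 1):
        x_f = a_model * x_a[k] + forcing[k]           # forecast with current model
        x_a[k + 1] = x_f + gain * (obs[k + 1] - x_f)  # pull forecast towards obs
    return x_a


def fit_ddm(x_a, forcing):
    """Toy DDM: least-squares fit of a in x[k+1] - u[k] = a * x[k]."""
    inputs = x_a[:-1]
    targets = x_a[1:] - forcing[:-1]
    return float(inputs @ targets / (inputs @ inputs))


rng = np.random.default_rng(0)
n_steps, a_true, a_imperfect, obs_std = 500, 0.9, 0.7, 0.1

# Synthetic truth and noisy observations (assumed setup for illustration).
forcing = np.sin(0.3 * np.arange(n_steps))
truth = np.zeros(n_steps)
for k in range(n_steps - 1):
    truth[k + 1] = a_true * truth[k] + forcing[k]
obs = truth + obs_std * rng.normal(size=n_steps)

# Cycle: DA with the current model -> analyses -> retrain DDM -> repeat.
a_history = [a_imperfect]
for cycle in range(5):
    analyses = run_da(a_history[-1], forcing, obs)
    a_history.append(fit_ddm(analyses, forcing))

print(a_history)
```

In this toy setting the analyses act as training data: each refit should move the coefficient in `a_history` from the imperfect value towards `a_true`, which in turn improves the forecasts used in the next DA pass.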

Competing interests: At least one of the (co-)authors is a member of the editorial board of Nonlinear Processes in Geophysics.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 07 May 2025)


Viewed

Total article views: 44 (including HTML, PDF, and XML)
  • HTML: 29
  • PDF: 12
  • XML: 3
  • Total: 44
  • BibTeX: 1
  • EndNote: 1
Views and downloads (calculated since 12 Mar 2025)

Viewed (geographical distribution)

Total article views: 55 (including HTML, PDF, and XML), of which 55 have a defined geography and 0 are of unknown origin.
Latest update: 18 Mar 2025
Short summary
Data-driven models (DDMs) learn from large datasets to make predictions, but data limitations affect reliability. Data assimilation (DA) improves accuracy by combining real-world observations with computational models. This research explores how DA enhances DDMs despite limited data. We propose an algorithm using DA to refine DDM training iteratively. This work has broad implications for fields like meteorology, engineering, and environmental science, where accurate prediction is critical.