Preprints
https://doi.org/10.5194/egusphere-2025-933
12 Mar 2025

Using Data Assimilation to Improve Data-Driven Models

Michael Goodliff and Takemasa Miyoshi

Abstract. Data-driven models (DDMs) are developed by analysing extensive datasets to detect patterns and make predictions, without relying on predefined rules or instructions from humans. In fields like numerical weather prediction (NWP), DDMs are gaining popularity as potential replacements for traditional numerical models, thanks to their grounding in multi-decadal, high-quality data assimilation (DA) analysis products. Recent studies, such as Lang et al. (2024), have demonstrated that DDMs trained on ERA5 (the fifth-generation European Centre for Medium-Range Weather Forecasts atmospheric reanalysis) can outperform traditional numerical models. DA combines observations from various sources with numerical models to improve the accuracy of state estimates and of predictions or simulations of a system's behaviour. Given the benefits of both approaches, integrating DDMs with DA has been gaining traction in a wide range of fields.
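
For readers unfamiliar with DA, the way observations and a model forecast are combined can be illustrated with the standard linear-Gaussian analysis update. This is a textbook form given only as background; the abstract does not state which DA scheme the paper uses, so the choice of formulation and all symbols below are assumptions. Here x^b is the model background (forecast) state, y the observation vector, H a linear observation operator, and B and R the background and observation error covariances.

```latex
\begin{aligned}
\mathbf{x}^a &= \mathbf{x}^b + \mathbf{K}\left(\mathbf{y} - \mathbf{H}\mathbf{x}^b\right),\\
\mathbf{K} &= \mathbf{B}\mathbf{H}^{\mathsf T}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathsf T} + \mathbf{R}\right)^{-1}.
\end{aligned}
```

The analysis x^a moves the background towards the observations by an amount set by the gain K: when R is small relative to B the observations dominate, and when R is large the analysis stays close to the model forecast.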

This paper focuses on applying DA methodologies to improve the precision and efficiency of DDM generation. The aim is to demonstrate that DA can refine and optimise DDM generation by incorporating various observation data directly, improving the accuracy and reliability of predictive models despite observational uncertainties. This study shows how DDMs can improve on imperfect model forecasts and how, in conjunction with DA, they can cyclically generate more accurate training data, further enhancing the precision of DDMs.
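
The cyclic idea described above can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the authors' algorithm: the scalar linear "truth", the deliberately wrong first-guess model, the fixed-gain analysis step, and the least-squares "DDM" are all hypothetical stand-ins chosen to keep the example short. Each cycle assimilates the same noisy observations with the current forecast model, fits a new DDM to the resulting analyses, and then uses that DDM as the forecast model in the next cycle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy configuration (all values are illustrative assumptions).
A_TRUE, B_TRUE = 0.9, 1.0   # truth: x[k+1] = 0.9 * x[k] + 1 + noise
PROC_STD = 0.5              # std of the random forcing in the truth
OBS_STD = 0.5               # std of the observation error
GAIN = 0.5                  # fixed scalar gain, i.e. B = R in K = B / (B + R)
N_STEPS = 2000
N_CYCLES = 5


def simulate_truth(n):
    """Generate the 'nature run' that the observations are drawn from."""
    x = np.empty(n)
    x[0] = 10.0
    for k in range(n - 1):
        x[k + 1] = A_TRUE * x[k] + B_TRUE + rng.normal(0.0, PROC_STD)
    return x


def assimilate(forecast, obs):
    """Fixed-gain forecast/analysis cycle; returns the analysis series."""
    xa = np.empty(len(obs))
    xa[0] = obs[0]
    for k in range(len(obs) - 1):
        xb = forecast(xa[k])                       # background from current model
        xa[k + 1] = xb + GAIN * (obs[k + 1] - xb)  # pull towards the observation
    return xa


def fit_ddm(xa):
    """Least-squares 'DDM': fit x[k+1] ~ w * x[k] + c to the analyses."""
    w, c = np.polyfit(xa[:-1], xa[1:], 1)
    return lambda x: w * x + c


def one_step_rmse(forecast, truth):
    """One-step forecast error when started from the true state."""
    return float(np.sqrt(np.mean((forecast(truth[:-1]) - truth[1:]) ** 2)))


truth = simulate_truth(N_STEPS)
obs = truth + rng.normal(0.0, OBS_STD, N_STEPS)


def imperfect_model(x):
    """Deliberately wrong first-guess model (slope 0.5 instead of 0.9)."""
    return 0.5 * x + 1.0


model = imperfect_model
rmse_history = [one_step_rmse(model, truth)]
for cycle in range(N_CYCLES):
    analyses = assimilate(model, obs)  # DA produces the training data
    model = fit_ddm(analyses)          # DDM is trained on the analyses
    rmse_history.append(one_step_rmse(model, truth))

print("one-step RMSE per cycle:", [round(r, 3) for r in rmse_history])
```

In this sketch the first entry of `rmse_history` is the error of the imperfect model and the later entries are the errors of successive DDMs. The errors are evaluated on the same trajectory used for training, so they indicate only in-sample behaviour; a held-out trajectory would be needed for a fair assessment.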

Competing interests: At least one of the (co-)authors is a member of the editorial board of Nonlinear Processes in Geophysics.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: closed (peer review stopped)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CC1: 'Comment on egusphere-2025-933', Zheqi Shen, 22 Mar 2025
  • RC1: 'Comment on egusphere-2025-933', Marc Bocquet, 03 Apr 2025
  • EC1: 'Comment on egusphere-2025-933', Jie Feng, 15 Apr 2025
  • RC2: 'Comment on egusphere-2025-933', Anonymous Referee #2, 19 Apr 2025
  • EC2: 'Comment on egusphere-2025-933', Jie Feng, 24 Apr 2025


Viewed

Total article views: 344 (including HTML, PDF, and XML)
  • HTML: 259
  • PDF: 70
  • XML: 15
  • Total: 344
  • BibTeX: 15
  • EndNote: 19
Views and downloads calculated since 12 Mar 2025.

Viewed (geographical distribution)

Total article views: 352 (including HTML, PDF, and XML) Thereof 352 with geography defined and 0 with unknown origin.
Latest update: 23 Jun 2025
Short summary
Data-driven models (DDMs) learn from large datasets to make predictions, but data limitations affect reliability. Data assimilation (DA) improves accuracy by combining real-world observations with computational models. This research explores how DA enhances DDMs despite limited data. We propose an algorithm using DA to refine DDM training iteratively. This work has broad implications for fields like meteorology, engineering, and environmental science, where accurate prediction is critical.