https://doi.org/10.48550/arXiv.2504.05218
25 Apr 2025
Status: this preprint is open for discussion and under review for Biogeosciences (BG).

Hybrid machine learning data assimilation for marine biogeochemistry

Ieuan Higgs, Ross Bannister, Jozef Skákala, Alberto Carrassi, and Stefano Ciavatta

Abstract. Marine biogeochemistry models are critical for forecasting and for estimating ecosystem responses to climate change and human activities. Data assimilation (DA) improves these models by aligning them with real-world observations, but marine biogeochemistry DA faces challenges due to model complexity, strong nonlinearity, and sparse, uncertain observations. Existing DA methods applied to marine biogeochemistry struggle to update unobserved variables effectively, while ensemble-based methods are computationally too expensive for high-complexity marine biogeochemistry models. This study demonstrates how machine learning (ML) can improve marine biogeochemistry DA by learning statistical relationships between observed and unobserved variables. We integrate ML-driven balancing schemes into a 1D prototype of a system used to forecast marine biogeochemistry in the North-West European Shelf seas. ML is applied to predict (i) state-dependent correlations from free-run ensembles and (ii) analysis increments from an Ensemble Kalman Filter, in an "end-to-end" fashion. Our results show that ML significantly enhances updates for variables that were previously not updated, when compared to univariate schemes akin to those used operationally. Furthermore, ML models exhibit moderate transferability to new locations, a crucial step toward scaling these methods to 3D operational systems. We conclude that ML offers a clear pathway to overcome current computational bottlenecks in marine biogeochemistry DA, and that refining transferability, optimizing training data sampling, and evaluating scalability for large-scale marine forecasting should be future research priorities.
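
To make the idea of "balancing" concrete, the following is a minimal sketch, not the authors' implementation, of how a statistical relationship between an observed and an unobserved variable can carry an update from one to the other. It assumes a simple linear ensemble regression, where the coefficient cov(u, o) / var(o) is computed directly from a toy ensemble; in the study, an ML model predicts such state-dependent relationships instead. The function name `balance_increment` and all variable names are hypothetical.

```python
import numpy as np


def balance_increment(ens_obs, ens_unobs, incr_obs):
    """Map an increment in an observed variable onto an unobserved one.

    Uses the ensemble regression coefficient cov(u, o) / var(o).
    This is an illustrative assumption, not the method of the preprint.
    """
    ens_obs = np.asarray(ens_obs, dtype=float)
    ens_unobs = np.asarray(ens_unobs, dtype=float)

    # Ensemble anomalies (deviations from the ensemble mean).
    anom_o = ens_obs - ens_obs.mean()
    anom_u = ens_unobs - ens_unobs.mean()

    n = ens_obs.size
    cov_uo = anom_u @ anom_o / (n - 1)  # sample cross-covariance
    var_o = anom_o @ anom_o / (n - 1)   # sample variance of observed variable

    # With no spread in the observed variable there is nothing to regress on.
    if var_o == 0.0:
        return 0.0
    return cov_uo / var_o * incr_obs


# Toy ensemble: the unobserved variable is an exact linear function of the
# observed one, so the regression coefficient is 2.
observed = np.array([1.0, 2.0, 3.0, 4.0])
unobserved = 2.0 * observed + 1.0
increment = balance_increment(observed, unobserved, incr_obs=0.5)
```

A univariate scheme would leave the unobserved variable unchanged; the regression coefficient is what lets the update reach it.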


Status: open (until 06 Jun 2025)


Viewed

Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.

Total article views: 63 (including HTML, PDF, and XML)
  • HTML: 63
  • PDF: 0
  • XML: 0
  • Total: 63
  • BibTeX: 0
  • EndNote: 0

Viewed (geographical distribution)


Total article views: 62 (including HTML, PDF, and XML), of which 62 have a defined geography and 0 have unknown origin.
Latest update: 15 May 2025
Short summary
We explored how machine learning can improve computer models that simulate ocean ecosystems. These models help us understand how the ocean works, but they often struggle due to limited observations and complex processes. Our approach uses machine learning to better connect the parts of the system we can observe with those we cannot. This leads to more accurate and efficient predictions, offering a promising way to improve future ocean monitoring and forecasting tools.