Convolutional Neural Networks for Sea Surface Data Assimilation in Operational Ocean Models: Test Case in the Gulf of Mexico

Zavala-Romero, Olmo; Bozec, Alexandra; Chassignet, Eric P.; Miranda, Jose R.

doi:https://doi.org/10.5194/egusphere-2024-1293

Preprints

https://doi.org/10.5194/egusphere-2024-1293

Preprints

08 May 2024

| 08 May 2024

Status: this preprint is open for discussion.

Convolutional Neural Networks for Sea Surface Data Assimilation in Operational Ocean Models: Test Case in the Gulf of Mexico

Olmo Zavala-Romero, Alexandra Bozec, Eric P. Chassignet, and Jose R. Miranda

Abstract. Deep learning models have demonstrated remarkable success in fields such as language processing and computer vision, routinely employed for tasks like language translation, image classification, and anomaly detection. Recent advancements in ocean sciences, particularly in data assimilation (DA), suggest that machine learning can emulate dynamical models, replace traditional DA steps to expedite processes, or serve as hybrid surrogate models to enhance forecasts. However, these studies often rely on ocean models of intermediate complexity, which involve significant simplifications that present challenges when transitioning to full-scale operational ocean models. This work explores the application of Convolutional Neural Networks (CNNs) in assimilating sea surface height and sea surface temperature data using the Hybrid Coordinate Ocean Model (HYCOM) in the Gulf of Mexico. The CNNs are trained to correct model errors from a two-year, high-resolution (1/25°) HYCOM dataset, assimilated using the Tendral Statistical Interpolation System (TSIS). We assess the performance of the CNNs across five controlled experiments, designed to provide insights into their application in environments governed by full primitive equations, real observations, and complex topographies. The experiments focus on evaluating: 1) the architecture and complexity of the CNNs, 2) the type and quantity of observations, 3) the type and number of assimilated fields, 4) the impact of training window size, and 5) the influence of coastal boundaries. Our findings reveal significant correlations between the chosen training window size—a factor not commonly examined—and the CNNs' ability to assimilate observations effectively. We also establish a clear link between the CNNs' architecture and complexity and their overall performance.

Received: 01 May 2024 – Discussion started: 08 May 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Olmo Zavala-Romero, Alexandra Bozec, Eric P. Chassignet, and Jose R. Miranda

Status: open (extended)

Post a comment Subscribe to comment alert

RC1:
'Comment on egusphere-2024-1293', Anonymous Referee #1, 24 Jul 2024 reply
In their work, the Authors address one very important initial step for making learned GCMs operational, as well as a way to accelerate a costly process in operational geosciences with traditional GCMs, namely the assimilation of observations into the operational framework. Given the chaotic nature of the system, any numerical model, learnt or not, requires updating with real observations, but exactly matching sparse and noisy real observations is a nonoptimal solution to the operational process.

My issues with the paper are both structural and scientific.

The authors remain unclear in the abstract as well as prior to the experiment phase as to the input and outputs of the deep learning models used.

Namely, they should clarify early on that they train a Deep Learning Architecture to reproduce the outputs of the T-SIS Assimilation, given the model forecast of SST and SSH and the simulated (? that has stayed unclear) satellite observations of SST and SSH. I sincerely hope that this is what is happening, since it remain unclear to me. I searched through the manuscript for an explanation of the observations, and it is nowhere to be found.

It is implied this is a twin experiment, but it really warrants clarification.

Similarly the assimilation step seems to be daily, but the assimilated field is complete, which is wildly unrealistic given the fields used as inputs

As to the rest of the article, the techniques used are now ~10 year old approaches, lacking a lot of significant improvements from the architectural side, but more importantly, on the implementation side, there are lots of elements missing:

The pre-processing is not detailed. Given the different value ranges of the input variables, it is crucial to perform a normalization beforehand.

Missing values: how are the missing values handled? The lack of mention of missing values seems to indicate a twin experiment, and even then it is unrealistic given the nature of the data.

There is no care taken for data leakage. There are no dates dropped between train validation and test.

The layers do not include any form of regularisation: Dropout, AdamW with a heavier weight loss penalization of (my preference) Batch Normalization

Newer techniques of deep learning are not addressed: CBAM layers in Sma-at U-Nets, or Masked Autoencoder ViT, or Denoising Diffusion Inpainting which are known to outperform U-Nets which tend to smooth out the output field

The network in all its configurations is not provided with any information on latitude or longitude, therefore preventing it from knowing contextually the Coriolis force

As far as the experiments are concerned, the presentation and analysis of the multiple base CNNs which are not really in use nowadays for these types of problems do not seem useful. Running this experiment with U-Nets that learn over different patches could be interesting, potentially.
The results however are encouraging and should the Authors significantly expand and clarify their paper, I would consider it a worthy contribution to the field.

Minor comments:
Section 2.2 would benefit from a quick bibliographical referencing of some of the many U-Nets and Sma-at U-Nets applied in a multitude of geoscience problems over the last 10 years.

Consider flipping table to horizontally into a two column paradigm so as to not imply linewise combinations of parameters

Reply
Citation: https://doi.org/10.5194/egusphere-2024-1293-RC1

Olmo Zavala-Romero, Alexandra Bozec, Eric P. Chassignet, and Jose R. Miranda

Viewed

Total article views: 234 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
139	76	19	234	16	15

HTML: 139
PDF: 76
XML: 19
Total: 234
BibTeX: 16
EndNote: 15

Views and downloads (calculated since 08 May 2024)

Month	HTML	PDF	XML	Total
May 2024	86	51	6	143
Jun 2024	28	9	2	39
Jul 2024	25	16	11	52

Cumulative views and downloads (calculated since 08 May 2024)

Month	HTML	PDF	XML	Total
May 2024	86	51	6	143
Jun 2024	28	9	2	39
Jul 2024	25	16	11	52

Viewed (geographical distribution)

Total article views: 243 (including HTML, PDF, and XML) Thereof 243 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 26 Jul 2024

Short summary

Deep learning is enhancing ocean science by improving data processing and forecasts. This study uses Convolutional Neural Networks (CNNs) to assimilate sea surface data in the Gulf of Mexico. Researchers conducted five experiments to evaluate the CNNs' performance across different designs and data types, revealing how training data volume and CNN design affect their effectiveness in operational ocean modeling.


Total:	0
HTML:	0
PDF:	0
XML:	0