Convolutional Neural Networks for Sea Surface Data Assimilation in Operational Ocean Models: Test Case in the Gulf of Mexico
Abstract. Deep learning models have demonstrated remarkable success in fields such as language processing and computer vision, routinely employed for tasks like language translation, image classification, and anomaly detection. Recent advancements in ocean sciences, particularly in data assimilation (DA), suggest that machine learning can emulate dynamical models, replace traditional DA steps to expedite processes, or serve as hybrid surrogate models to enhance forecasts. However, these studies often rely on ocean models of intermediate complexity, which involve significant simplifications that present challenges when transitioning to full-scale operational ocean models. This work explores the application of Convolutional Neural Networks (CNNs) in assimilating sea surface height and sea surface temperature data using the Hybrid Coordinate Ocean Model (HYCOM) in the Gulf of Mexico. The CNNs are trained to correct model errors from a two-year, high-resolution (1/25°) HYCOM dataset, assimilated using the Tendral Statistical Interpolation System (TSIS). We assess the performance of the CNNs across five controlled experiments, designed to provide insights into their application in environments governed by full primitive equations, real observations, and complex topographies. The experiments focus on evaluating: 1) the architecture and complexity of the CNNs, 2) the type and quantity of observations, 3) the type and number of assimilated fields, 4) the impact of training window size, and 5) the influence of coastal boundaries. Our findings reveal significant correlations between the chosen training window size—a factor not commonly examined—and the CNNs' ability to assimilate observations effectively. We also establish a clear link between the CNNs' architecture and complexity and their overall performance.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-1293', Anonymous Referee #1, 24 Jul 2024
In their work, the Authors address one very important initial step for making learned GCMs operational, as well as a way to accelerate a costly process in operational geosciences with traditional GCMs, namely the assimilation of observations into the operational framework. Given the chaotic nature of the system, any numerical model, learnt or not, requires updating with real observations, but exactly matching sparse and noisy real observations is a nonoptimal solution to the operational process.
My issues with the paper are both structural and scientific.
The authors remain unclear in the abstract, as well as prior to the experiment phase, as to the inputs and outputs of the deep learning models used.
Namely, they should clarify early on that they train a Deep Learning Architecture to reproduce the outputs of the T-SIS Assimilation, given the model forecast of SST and SSH and the simulated (? that has stayed unclear) satellite observations of SST and SSH. I sincerely hope that this is what is happening, since it remains unclear to me. I searched through the manuscript for an explanation of the observations, and it is nowhere to be found.
It is implied this is a twin experiment, but it really warrants clarification. Similarly, the assimilation step seems to be daily, but the assimilated field is complete, which is wildly unrealistic given the fields used as inputs.
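If my reading is correct, the learned mapping would be something along the lines of the sketch below (my reconstruction of the setup, not the authors' code; all names are hypothetical):

```python
# Hypothetical sketch of the mapping I understand the CNN to be learning:
#   inputs : HYCOM forecast of SSH/SST + (simulated?) satellite observations of SSH/SST
#   target : the corresponding T-SIS analysis (or analysis increment)
import numpy as np

def build_sample(forecast_ssh, forecast_sst, obs_ssh, obs_sst, tsis_analysis):
    """Stack forecast and observation fields as channels; the target is the T-SIS output."""
    x = np.stack([forecast_ssh, forecast_sst, obs_ssh, obs_sst], axis=0)  # (4, ny, nx)
    y = tsis_analysis                                                     # (n_out, ny, nx)
    return x, y
```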
As to the rest of the article, the techniques used are now ~10-year-old approaches, lacking a lot of significant improvements from the architectural side, but more importantly, on the implementation side, there are lots of elements missing (a brief sketch of what I have in mind follows this list):
- The pre-processing is not detailed. Given the different value ranges of the input variables, it is crucial to perform a normalization beforehand.
- Missing values: how are the missing values handled? The lack of mention of missing values seems to indicate a twin experiment, and even then it is unrealistic given the nature of the data.
- There is no care taken against data leakage: no dates are dropped between the train, validation, and test sets.
- The layers do not include any form of regularisation: Dropout, AdamW with a heavier weight-decay penalty, or (my preference) Batch Normalization.
- Newer deep learning techniques are not addressed: CBAM layers in Sma-at U-Nets, Masked Autoencoder ViTs, or Denoising Diffusion Inpainting, which are known to outperform U-Nets, which tend to smooth out the output field.
- The network, in all its configurations, is not provided with any information on latitude or longitude, preventing it from contextually accounting for the Coriolis force.
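To make these implementation points concrete, here is a minimal PyTorch-style sketch of the kind of detail I would expect to see documented (all names and values are hypothetical, purely illustrative):

```python
import numpy as np
import torch
import torch.nn as nn

# (1) Per-variable normalization, with statistics computed on the training period only
def normalize(field, mean, std, land_mask):
    out = (field - mean) / std
    out[land_mask] = 0.0          # (2) explicit handling of missing / land values
    return out

# (3) Chronological split with a buffer of dropped days to avoid leakage
def chronological_split(dates, train_frac=0.7, val_frac=0.15, gap_days=10):
    n = len(dates)
    i_train = int(n * train_frac)
    i_val = int(n * (train_frac + val_frac))
    train = dates[:i_train]
    val = dates[i_train + gap_days:i_val]
    test = dates[i_val + gap_days:]
    return train, val, test

# (4) Regularization: BatchNorm/Dropout inside the blocks, AdamW with weight decay
class ConvBlock(nn.Module):
    def __init__(self, c_in, c_out, p_drop=0.1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(),
            nn.Dropout2d(p_drop),
        )

    def forward(self, x):
        return self.block(x)

# (5) Latitude/longitude supplied as extra input channels, so the network has the
#     spatial context needed to represent latitude-dependent (Coriolis) dynamics
def add_coord_channels(x, lat2d, lon2d):
    coords = torch.stack([torch.as_tensor(lat2d), torch.as_tensor(lon2d)]).float()
    return torch.cat([x, coords.expand(x.shape[0], -1, -1, -1)], dim=1)

# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
```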
As far as the experiments are concerned, the presentation and analysis of the multiple base CNNs, which are not really in use nowadays for these types of problems, do not seem useful. Running this experiment with U-Nets that learn over different patches could be interesting, potentially (see the brief patch-sampling sketch below).
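As a rough illustration of what I mean by learning over patches (sizes arbitrary, for illustration only):

```python
import torch

def random_patches(fields, patch=64, n=8):
    """Sample square training patches from full-domain fields of shape (B, C, H, W).
    Purely illustrative of patch-wise U-Net training; assumes H, W >= patch."""
    b, c, h, w = fields.shape
    out = []
    for _ in range(n):
        i = torch.randint(0, h - patch + 1, (1,)).item()
        j = torch.randint(0, w - patch + 1, (1,)).item()
        out.append(fields[:, :, i:i + patch, j:j + patch])
    return torch.cat(out, dim=0)
```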
The results however are encouraging and should the Authors significantly expand and clarify their paper, I would consider it a worthy contribution to the field.
Minor comments:
Section 2.2 would benefit from a quick bibliographical reference to some of the many U-Nets and Sma-at U-Nets applied to a multitude of geoscience problems over the last 10 years.
Consider flipping Table 2 horizontally into a two-column layout so as not to imply line-wise combinations of parameters.
Citation: https://doi.org/10.5194/egusphere-2024-1293-RC1
RC2: 'Comment on egusphere-2024-1293', Michael Gray, 27 Jul 2024
General Comments
This work explores the ability of Convolutional Neural Networks (CNNs) to serve as a surrogate to the Tendral Statistical Interpolation System (TSIS) method of data assimilation between the Hybrid Coordinate Ocean Model (HYCOM) and observations. In addition to evaluating the performance of their models in the Gulf of Mexico, the authors also quantify the difference in skill of various hyperparameters and model architectures in this highly dynamic region. The results of this paper show the technique is sound and has potential to be used operationally.
While the study is well conducted and the results are relevant to the community, the report will benefit from more specificity in the model architectures and data preprocessing. All such information is discernible in the code repository provided, but it should be expressed in the report (a short sketch of the level of detail I mean follows this list). This includes:
- Entire preprocessing procedure
- The handling of land points (e.g. masked as 0)
- Structure of tensors used as input
- Use of batch normalization
- Point where normalization/denormalization occurs
- Structure of output tensors
- Which parameters were tuned via hyperparameter optimization
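For example, the level of specificity I have in mind (function names are hypothetical; they only illustrate what should be stated in the text):

```python
import torch

def masked_mse(pred, target, ocean_mask):
    """Loss computed over ocean points only; land points (masked as 0) are excluded."""
    diff = (pred - target) ** 2
    return (diff * ocean_mask).sum() / ocean_mask.sum()

def predict_physical(model, x_norm, mean, std):
    """Run the network in normalized space, then denormalize back to physical units."""
    with torch.no_grad():
        y_norm = model(x_norm)
    return y_norm * std + mean
```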
Finally, there should be some comparison to other models/techniques used for similar purposes in geosciences. Much has been written on this topic in the atmospheric sciences. How could this study be extended to newer, more sophisticated model architectures?
Specific Comments
Line 80: The “Markov process” mentioned is not described or referenced. Presumably, modelers and computational scientists will be familiar with the meaning, but it wouldn’t hurt to briefly describe this meaning.
Line 98: Similar to above; the Gaussian Markov Random Field is not explained or referenced.
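For both of the above, even a one-line generic definition would help the reader, e.g. something along the lines of (notation purely illustrative, not taken from the manuscript):

```latex
% First-order Markov (state-space) evolution of the ocean state x_k:
x_{k+1} = \mathcal{M}(x_k) + \eta_k, \qquad \eta_k \sim \mathcal{N}(0, Q_k)
% Gaussian Markov Random Field: a Gaussian whose precision matrix Q is sparse,
% encoding only local (neighbourhood) dependencies on the model grid:
x \sim \mathcal{N}(\mu, Q^{-1}), \qquad Q \ \text{sparse}
```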
Sect. 2.2: If CNNs are going to be explained in this detail, it would help to show a figure differentiating traditional CNNs from UNets/encoder-decoder networks since the difference is hard to visualize. UNets are referenced, CNNs are not. There should be consistency in the degree of explanations in this section.
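To illustrate the distinction I mean (a figure would serve the paper better; the modules below are deliberately minimal and hypothetical):

```python
import torch
import torch.nn as nn

class PlainCNN(nn.Module):
    """Traditional CNN: a stack of convolutions at full resolution, no down/upsampling."""
    def __init__(self, c_in, c_out, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_in, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, c_out, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class TinyUNet(nn.Module):
    """Encoder-decoder with a skip connection: downsample, upsample, concatenate."""
    def __init__(self, c_in, c_out, width=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(c_in, width, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.dec = nn.Conv2d(2 * width, c_out, 3, padding=1)

    def forward(self, x):
        e = self.enc(x)                      # full-resolution features
        m = self.up(self.mid(self.down(e)))  # coarse features brought back up
        return self.dec(torch.cat([e, m], dim=1))  # skip connection via concatenation
```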
Line 156: “hindcast” is used to describe the training data (I assume). There is no mention of performing the TSIS technique with awareness of time, so can this be called a hindcast?
Line 150: I am not familiar with the use of the word “innovations” here and in the following line. Could you rephrase?
Line 291: Why this day specifically? Whether it was randomly selected or chosen based on best/worst performance should be noted.
Line 347: Units should be placed on “58” and “22”. It is unclear that these are factors until reading the table.
Technical Comments
Throughout the paper, citations are not delineated from the clause that precedes them (i.e. separated with a comma or wrapped in parentheses). Some examples: Lines 48-49, Lines 117-118, Line 126, Line 144, Line 153, Line 219.
Line 114: “coming” → “coming”
Line 125: “U-net” should be plural (?)
Line 156: “test” → “testing”
Line 197: “1[h]”?
Citation: https://doi.org/10.5194/egusphere-2024-1293-RC2