the Creative Commons Attribution 4.0 License.
Comparison of 4-Dimensional Variational and Ensemble Optimal Interpolation data assimilation systems using a Regional Ocean Modelling System (v3.4) configuration of the eddy-dominated East Australian Current System
Abstract. Ocean models must be regularly updated through the assimilation of observations (data assimilation) in order to correctly represent the timing and locations of eddies. Since initial conditions play an important role in the quality of short-term ocean forecasts, an effective data assimilation scheme to produce accurate state estimates is key to improving prediction. Western boundary current regions, such as the East Australian Current system, are highly variable, making them particularly challenging to model and predict. This study assesses the performance of two ocean data assimilation systems in the East Australian Current system over a two-year period. We compare the time-dependent 4-Dimensional Variational (4D-Var) data assimilation system with the more computationally efficient, time-independent Ensemble Optimal Interpolation (EnOI) system, across a common modelling and observational framework. Both systems assimilate the same observations: satellite-derived sea-surface height, sea-surface temperature, vertical profiles of temperature and salinity (from Argo floats), and temperature profiles from eXpendable Bathy-Thermographs. We analyse both systems' performance against independent data that are withheld from assimilation, allowing a thorough analysis of system performance. The 4D-Var system is 25 times more expensive but outperforms the EnOI system against both assimilated and independent observations at the surface and subsurface. For forecast horizons of 5 days, root-mean-squared forecast errors are 20–60 % higher for the EnOI system compared to the 4D-Var system. The 4D-Var system, which assimilates observations over 5-day windows, provides a smoother transition from the end of the forecast to the subsequent analysis field. The EnOI system displays elevated low-frequency (>1 day), surface-intensified variability in temperature, and elevated kinetic energy at length scales less than 100 km at the beginning of the forecast windows.
The 4D-Var system displays elevated energy in the near-inertial range throughout the water column, with the wavenumber kinetic energy spectra remaining unchanged upon assimilation. Overall, this comparison shows quantitatively that the 4D-Var system results in improved predictability as the analysis provides a smoother and more dynamically-balanced fit between the observations and the model's time-evolving flow. This advocates the use of advanced, time-dependent data assimilation methods, particularly for highly variable oceanic regions, and motivates future work into further improving data assimilation schemes.
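To make the "time-independent" character of EnOI concrete, the following is a minimal sketch of a single EnOI-style analysis step. It assumes a linear observation operator, a static ensemble of model anomalies, and no localisation; the names `enoi_analysis`, `anomalies` and `alpha` are illustrative and are not taken from the systems compared in the paper.

```python
import numpy as np

def enoi_analysis(x_f, anomalies, H, y, R, alpha=1.0):
    """One EnOI-style update: x_a = x_f + K (y - H x_f).

    x_f:       (n,) forecast state
    anomalies: (n, m) static ensemble of model states (time-independent)
    H:         (p, n) linear observation operator (an assumption of this sketch)
    y:         (p,) observations
    R:         (p, p) observation error covariance
    alpha:     scalar scaling of the static covariance (hypothetical tuning knob)
    """
    m = anomalies.shape[1]
    # Remove the ensemble mean so the columns are true anomalies
    A = anomalies - anomalies.mean(axis=1, keepdims=True)
    HA = H @ A
    # B H^T and H B H^T with B = alpha * A A^T / (m - 1), without forming B
    BHt = alpha * (A @ HA.T) / (m - 1)
    HBHt = alpha * (HA @ HA.T) / (m - 1)
    innovation = y - H @ x_f
    return x_f + BHt @ np.linalg.solve(HBHt + R, innovation)

# Toy example: 6 state variables, 20 static members, 2 point observations
rng = np.random.default_rng(0)
n, m, p = 6, 20, 2
x_f = np.zeros(n)
anomalies = rng.standard_normal((n, m))
H = np.zeros((p, n))
H[0, 1] = 1.0
H[1, 4] = 1.0
y = np.array([1.0, -0.5])
R = 0.01 * np.eye(p)
x_a = enoi_analysis(x_f, anomalies, H, y, R)
```

Because the covariance here is built from a fixed ensemble, the same `anomalies` array would be reused at every analysis time, whereas 4D-Var uses the model dynamics to relate the observations to the initial state over the assimilation window.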
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-2355', Anonymous Referee #1, 13 Nov 2023
This manuscript presents a very detailed and thorough comparison of two ocean data assimilation methods applied to the East Australian Current (EAC): 4D-Var and EnOI. 4D-Var is a computationally intensive method that takes full advantage of the time-dependence of the circulation and ocean dynamics as described by the model, while EnOI is less computationally demanding, and uses information that is static in time. Both systems are currently employed in the Australian marine community, so a comparison of the two is of considerable interest. I congratulate the authors on a very nice study which will be of interest to the broader oceanographic data assimilation community. I recommend publication after the authors have addressed my comments below, most of which are relatively minor.
The exceptions are section 2.4 and section 4.3.
Section 2.4 needs a bit of an overhaul since there is some repetition and the notation used throughout is not consistent. See below for more detailed comments.
Section 4.3 is highly speculative and unconvincing for this reviewer.
Lines 64-66: The reference to 3DVar in NWP seems a bit out of place here. A better ocean reference here would be NEMOVAR run at ECMWF which uses 3D-Var.
Line 214: H(.) does not have to be linear. In 4D-Var, for example, it includes the nonlinear model.
Line 214: Replace "interpolates" with "samples"
Line 220: Your equation for G does not represent the general case. G is the tangent linearization of H(.). The equation G=H*M stated here implies a single observation time at the end of the forecast window. More generally, G would be given by the sum of terms involving H*M at each of the observation times.
Line 221: Do not use bold font here since this implies that this is a matrix. The forecast model though will, in general, be non-linear so it cannot be represented by a matrix. Use the same font as you use at line 280.
Line 221: Replace "model" with "nonlinear model"
Line 249: This M should be "Mf"
Lines 259-260: Can you say a bit more about the localization operator - what localization function do you use, and where is it applied in the equivalent of equation (2)?
Equation (4): Use upper-case X to be consistent with equation (1).
Equation (5): Replace HM by G to be consistent with equation (2).
Line 280: The font you use here to represent the nonlinear model should be what you use at line 221.
Line 282: Insert "... introduced above" after the expression for d.
Line 282: The "H" operator used in your expression for d and later on this line should not be bold; it is the same as the nonlinear observation operator introduced at line 215.
Line 282: Replace "interpolates" with "samples"
Line 282: Replace lower-case x with upper-case X. Also you introduce superscript "b" here while in equation (1) you use "f". Use a consistent notation throughout otherwise it looks like you are talking about different objects.
Line 283: Replace "P" with "B" (here and throughout) to be consistent with equation (2).
Equation 6: Replace "HM" with "G" and "P" with "B"
Lines 289 and 290: Delete equation (7) and line 290 since they are irrelevant here. It is (8) that is consistent with the form of the Kalman filter gain matrix.
Equation (8): Use superscript "a" instead of subscript to be consistent with equation (1).
Lines 293-295: Delete the last two full sentences - this is repetitive information.
Line 298: Replace HM with G.
Line 298: The sentence beginning "The adjoint model then computes..." is true only for the primal form of 4D-Var which I understand you are not using here.
Line 317: Delete "univariate covariance" and replace Kb with Kb=I.
Line 318: Replace P with B.
Line 320: It would be helpful to include a table here that summarizes the correlation lengths assumed for the control vector elements.
Line 333: In the 4D-Var experiments, are you adjusting only the initial conditions, or are you adjusting the surface forcing and open boundary conditions as well? If the latter, this represents another significant difference between the 4D-Var strategy and the EnOI strategy.
Lines 334 and 335: Another advantage of the EnKF is that the ensemble members can be run simultaneously if sufficient computing resources are available.
Line 338 and 339: Observation impacts can be computed from ensemble methods also using ensemble FSOI e.g.: Liu, J., Kalnay, E., 2008. Estimating observation impact without adjoint model in an ensemble Kalman filter. Q. J. R. Meteorol. Soc. 134, 1327–1335.
Table 1: What does "PER MAD" refer to in the table? It is not mentioned anywhere in the main text. Remove from the table if it is not relevant.
Lines 449 and 450: Do these discontinuities/differences correspond to the DA increments?
Lines 503 and 504: This statement is not necessarily always true. If the model background is deficient at some space- and/or time-scale, then these may be corrected by DA so that the analyses and forecasts are better.
Figure 11: It would be helpful to also show the mean and sqrt of the variance from a free run of the model without DA to see how assimilation changes these fields along the two sections shown.
Lines 537 and 538: I think it is a stretch to think that you are adequately resolving the submesoscale here.
Lines 547 and 548: Did you actually calculate the spectral transfer function? The approximate slope of the wavenumber spectrum alone is not enough to infer that there is an inverse energy cascade. While dx=2.5 km over the shelf, the effective resolution of the model is probably more like 3dx or 4dx, and off the shelf dx is larger. Various published studies show that there is a forward energy cascade at the ocean submesoscale when it is adequately resolved. There is also a suggestion that the canonical slope should be -2. The following is an excellent review article: McWilliams, J.C., 2016: Proc. R. Soc. A, 472, 20160117.
Line 550: After cascade insert "and consistent with". That said though, I am not convinced by the arguments you make here unless you demonstrate by direct calculation that there is in fact an inverse energy cascade in your model.
Lines 531-556: I find this whole section to be quite speculative. The canonical spectral slopes for QG and SQG are derived from highly idealized, and unforced simulations. Numerous model studies with forcing, and enhanced beta-effect (i.e. bathymetry) indicate that other slopes are possible. In addition, are the arguments made here consistent with the barotropic and baroclinic conversions discussed earlier? Barotropic and baroclinic instabilities are fundamentally very different in nature, so it is not clear to what extent one would expect the canonical cases to apply in a mixture of the two.
Line 663: Replace P with B.
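The comment on line 220 can be illustrated with a small linear sketch. It assumes a constant one-step tangent linear propagator and one linearized observation operator per observation time; `build_G`, `M_step` and `H_list` are hypothetical names, not the manuscript's notation. With observations at several times in the window, G stacks the blocks H_i M(t0 -> t_i), so that G^T R^-1 G becomes a sum of H*M terms over the observation times.

```python
import numpy as np

def build_G(M_step, H_list):
    """Stack H_i @ M(t0 -> t_i) for observation times i = 1..len(H_list).

    M_step: (n, n) tangent linear propagator for one step (assumed constant here)
    H_list: list of (p_i, n) linearized observation operators, one per time
    """
    n = M_step.shape[0]
    M_to_i = np.eye(n)
    blocks = []
    for H_i in H_list:
        M_to_i = M_step @ M_to_i  # propagate from t0 to observation time t_i
        blocks.append(H_i @ M_to_i)
    return np.vstack(blocks)

rng = np.random.default_rng(1)
n = 4
M_step = np.eye(n) + 0.1 * rng.standard_normal((n, n))
H_list = [rng.standard_normal((2, n)) for _ in range(3)]
G = build_G(M_step, H_list)

# With R = I, G^T G equals the sum over observation times of (H_i M_i)^T (H_i M_i)
M_i = np.eye(n)
hessian_sum = np.zeros((n, n))
for H_i in H_list:
    M_i = M_step @ M_i
    hessian_sum += (H_i @ M_i).T @ (H_i @ M_i)
```

The special case G = H*M is recovered when `H_list` holds a single operator applied at the end of the window.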
Citation: https://doi.org/10.5194/egusphere-2023-2355-RC1
AC1: 'Reply on RC1', Colette Kerry, 15 Dec 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-2355/egusphere-2023-2355-AC1-supplement.pdf
AC2: 'Reply to RC1', Colette Kerry, 15 Dec 2023
RC2: 'Comment on egusphere-2023-2355', Anonymous Referee #2, 15 Dec 2023
The present study compares two DA methods, namely, EnOI and 4DVAR, using a regional ocean modelling system for forecasting the East Australian Current over a two-year period. While the manuscript is well written, I cannot understand the scientific basis of this study. It is well known that time-independent models are faster and less accurate, while time-dependent assimilation models are time-consuming but relatively more accurate. The authors provide justification for the study's objective to contrast the two DA system configurations that each user group has customised. However, the fine-tuning of DA models by respective groups is not based on a common goal. At the outset, the comparison only reinforces the fact that the rate of forecast error growth is smaller in the 4DVAR model as compared to the EnOI. Perhaps the EnOI and 4DVAR groups can identify some common goals (such as acceptable computational time and resource, acceptable rate of forecast error growth, etc) and see if any common parameters exist that can have a significant impact.
Citation: https://doi.org/10.5194/egusphere-2023-2355-RC2
AC3: 'Reply on RC2', Colette Kerry, 18 Dec 2023
We thank the reviewer for taking the time to read the manuscript. We agree that it is well known that time-independent models are faster and less accurate, while time-dependent assimilation models are time-consuming but relatively more accurate; however, detailed comparisons across common platforms that reveal the specific differences are uncommon in the scientific literature. The scientific basis of this study is twofold. Firstly, we show in detail how the predictive skill of the two systems differs against both assimilated observations and a comprehensive set of independent observations. Secondly, given these differences, we delve into model space to understand the different temporal and spatial scales of variability that come about due to each data assimilation system. The stark difference is of scientific interest and has never been shown before in the literature.
It is not clear what the reviewer means by "the fine-tuning of DA models by respective groups is not based on a common goal." As we emphasise in the discussion, the goal of this study is to quantify the rate of forecast error growth and the response of the ocean state to the assimilation methodology, as well as to set a baseline for future comparisons. We feel that this latter goal is important given the move towards more advanced DA systems in operational centres and the increasing use of DA in regional forecast systems. As we state in the discussion, the goal of this study was not to compare various versions of each DA method nor to compare the fit in the analyses. Rather, we focus on the rate of forecast error growth and the response of the ocean state to each assimilation methodology. As such, the study’s utility and scientific relevance are significant without a large number of comparisons across different parameters or DA system configurations.
With regard to the reviewer's final comment, "Perhaps the EnOI and 4DVAR groups can identify some common goals (such as acceptable computational time and resource, acceptable rate of forecast error growth, etc) and see if any common parameters exist that can have a significant impact.", we do not think that it is necessary to attempt to build two systems with the same computational requirements or the same forecast skill for the study's comparison to be of scientific interest. We feel that we have adequately addressed the advantages and drawbacks of each system depending on the specific goals. For example, in the discussion we state that "The EnOI system is ∼25 times cheaper than the 4D-Var system presented here. It is noted that EnOI has been effective for long-term reanalysis products where analyses were created every day (Oke et al., 2008b; Chamberlain et al., 2021b) and forecasts were not required. With increasing computational capacity and the pursuit of more accurate ocean forecasts, this study’s comparison motivates the use of 4D-Var over EnOI for ocean forecasts of the EAC region. This result is likely to be applicable over similar, highly variable, oceanic regions such as WBCs. More generally, the comparison advocates for the use of advanced time-dependent DA schemes over time-independent methods. We illustrate how a DA scheme can influence forecast skill which motivates future development of DA methods."
Citation: https://doi.org/10.5194/egusphere-2023-2355-AC3
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
258 | 95 | 13 | 366 | 9 | 8 |
Colette Gabrielle Kerry
Moninya Roughan
Shane Keating
David Gwyther
Gary Brassington
Adil Siripitana
Joao Marcos A. C. Souza