Validating the spatial variability of the semidiurnal internal tide in a realistic global ocean simulation with Argo and mooring data

Geoffroy, Gaspard; Nycander, Jonas; Buijsman, Maarten C.; Shriver, Jay F.; Arbic, Brian K.

doi:https://doi.org/10.5194/egusphere-2022-1085

24 Oct 2022

Abstract. The total variance and decorrelation of the semidiurnal internal tide (IT) are examined in a 32-day segment of a global run of the HYbrid Coordinate Ocean Model (HYCOM). This numerical simulation, with 41 vertical layers and 1/25 degree horizontal resolution, includes tidal and atmospheric forcing allowing for the generation and propagation of IT to take place within a realistic eddying general circulation. The HYCOM data are in turn compared with global observations of the IT around 1,000 dbar, from Argo float park phase data and mooring records. HYCOM is found to be globally biased low in terms of total variance and decorrelation of the semidiurnal IT over timescales shorter than 32 days. Except in the Southern Ocean, where limitations in the model cause the discrepancy with in situ measurements to grow poleward, the spatial correlation between the Argo and HYCOM inferred total variance suggests that the generation of low-mode semidiurnal IT is globally well captured by the model.

Received: 12 Oct 2022 – Discussion started: 24 Oct 2022

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

12 Jun 2023

Validating the spatial variability in the semidiurnal internal tide in a realistic global ocean simulation with Argo and mooring data

Gaspard Geoffroy, Jonas Nycander, Maarten C. Buijsman, Jay F. Shriver, and Brian K. Arbic

Ocean Sci., 19, 811–835, https://doi.org/10.5194/os-19-811-2023, 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-1085', Anonymous Referee #1, 21 Nov 2022

The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the time-lagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32-day-long HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at near-zero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used Caspar-Cohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.

They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing small-scale mesoscale or submesoscale variability in HYCOM or the deficit of high-frequency wind forcing.

Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.

Comments:

l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "non-total" quantity to understand what distinction is implied by "total".

l11: "beams" -> "waves" or "beams of waves"

l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".

l19: "causes" -> "cause"

l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest re-thinking the purpose of this paragraph and re-writing it to more-clearly articulate the point you wish to make.

l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.

l52: Finally, "the autocovariance at short time lags", is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and high-frequency ocean variability when looking at the "short time lags"?

l52: "On the other hand" -- I am not clear what is the other part of the contrast. Omit this phrase?

l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But, the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.

l63: omit "the strength of"

l73: "can vary" by how much?

l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?

l111: Is dT/dP in the numerator the same as dTbar/dP?

l114: "obtained" -> "estimated"

l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?

l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right? Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a non-robust estimator (Var A)?

l157: "not significantly different" -- Well, I agree that they do fall within each other's standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?

l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?

Fig 3: This is related to the above question: Why are the Eulerian error bars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?

Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).

l190: "for each particle" -> "for each HYCOM particle" ? But if this paragraph applies to HYCOM, how can there be outliers?

l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified.

Paragraph at line 195: This is a very good comparison and a little surprising, to me.

Fig 6: Why are the maps drawn so small?

Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.

l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?

l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?

l227: Once again, "intrinsic" does not seem to be the right word.

l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.

l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.

l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh -- I see it in Table 1.

l262: I am not sure what the "discrepancy" refers to.

l265: SCVF^{15} -> SCVF_{15}

l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.

l275: "mecanism" -> "mechanism"

l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.

l306: "sinusoide" -> "sinusoid"

Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.

l330-358: Modes discussion.

l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.

l386-401: Stratification.

None of these discussions really deal with the overall quantitative difference of HYCOM vs obs which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.

l403: "run with" -> "run of"

l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?

l422: "stationnary" -> "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".

l429: "unaffected" -> "unaffected in the mean"

l339: "supposedly account" -> "supposedly accounts"

l452: latex formatting needs help in the URL.

l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html

References: inconsistent capitalization is used in article titles

l557: "and contributors, T. P."

Citation: https://doi.org/10.5194/egusphere-2022-1085-RC1
- AC1: 'Reply on RC1', Gaspard Geoffroy, 24 Jan 2023
  
  We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, helped to improve the manuscript.
  General comments:
  The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the time-lagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32-day-long HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at near-zero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used Caspar-Cohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.
  They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing small-scale mesoscale or submesoscale variability in HYCOM or the deficit of high-frequency wind forcing.
  Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.
  
  Comments:
  l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "non-total" quantity to understand what distinction is implied by "total".
  Ans.: Reworked abstract. Suppressed “total”.
  
  l11: "beams" -> "waves" or "beams of waves"
  Ans.: Corrected
  
  l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".
  Ans.: The sentence is correct: the phase difference accounts for the propagation of the waves from the generation site to any given position, but it is constant in time.
  
  l19: "causes" -> "cause"
  Ans.: Corrected
  
  l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest re-thinking the purpose of this paragraph and re-writing it to more-clearly articulate the point you wish to make.
  Ans.: Reworked paragraph, re-centering it on the stationary/nonstationary wave field.
  
  l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.
  Ans.: “total” was defined earlier in the text (l.25). Added italic font.
  
  l52: Finally, "the autocovariance at short time lags", is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and high-frequency ocean variability when looking at the "short time lags"?
  Ans.: “total” was defined earlier in the text (l.25). Provided short explanation on how noise is filtered out.
  
  l52: "On the other hand" -- I am not clear what is the other part of the contrast. Omit this phrase?
  Ans.: Contrasts the Lagrangian decorrelation downside with the fine time resolution upside.
  
  l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But, the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.
  Ans.: Agreed. Replaced "intrinsic decorrelation" by "decorrelation" or "decorrelation of the IT", and further "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
  
  l63: omit "the strength of"
  Ans.: Corrected
  
  l73: "can vary" by how much?
  Ans.: Replaced by “The sampling period of the park phase can occasionally vary by more than a few seconds.” The vast majority of the park phases we use have a sampling period of 1h. Very rarely do park phases have sampling periods significantly shorter (and even more rarely longer) than one hour.
  
  l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?
  Ans.: Added information at the beginning of section 2, and in sections 2 and 3 (isotherm displacement is properly defined in section 3).
  
  l111: Is dT/dP in the numerator the same as dTbar/dP?
  Ans.: No: Tbar in the numerator is not a function of z, and they are calculated independently.
  
  l114: "obtained" -> "estimated"
  Ans.: Corrected
  
  l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?
  Ans.: This estimator is unbiased: for increasingly large N, the estimator converges to the true value.
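
The question here turns on two separate properties: whether the expected value of the estimator equals the true autocovariance at fixed N, and whether the estimate converges as N grows. The following is a minimal numerical sketch of that distinction, not the manuscript's estimator. It assumes a hypothetical zero-mean sinusoid with a uniformly random phase (so the true autocovariance is known exactly), and the function name `sample_autocov` and all parameter values are illustrative choices.

```python
import numpy as np

def sample_autocov(x, lag, unbiased=True):
    """Sample autocovariance of a zero-mean series at a non-negative integer lag."""
    n = len(x)
    products = x[: n - lag] * x[lag:]
    # Divide by the number of products (N - lag) or by the record length N.
    return products.sum() / ((n - lag) if unbiased else n)

# Hypothetical test signal: unit-amplitude sinusoid with a 12.42 h period,
# sampled hourly over 767 values, with a uniformly random phase per realisation.
n, lag = 767, 384
omega = 2 * np.pi / 12.42
t = np.arange(n)
true_value = 0.5 * np.cos(omega * lag)  # exact autocovariance at this lag

rng = np.random.default_rng(0)
phases = rng.uniform(0.0, 2 * np.pi, size=500)
series = np.cos(omega * t[None, :] + phases[:, None])  # shape (500, 767)

mean_unbiased = np.mean([sample_autocov(x, lag, True) for x in series])
mean_biased = np.mean([sample_autocov(x, lag, False) for x in series])

print(true_value, mean_unbiased, mean_biased)
```

With a known zero mean, the 1/(N - tau) normalisation recovers the true value on average at fixed N, whereas the 1/N normalisation scales it by (N - tau)/N. Convergence for increasing N is a separate property (consistency), and removing a sample mean estimated from the same record would add a small bias to either form.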
  
  l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right? [...]
  Ans.: Indeed, the acov is an even function. However, when two tidal constituents close in frequency are present, the autocovariance of the resulting beating can be expressed in terms of a cosine and a sine component (short derivation attached).
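
The attached derivation is not reproduced on this page. One way such a decomposition can arise is sketched below, under assumptions that may differ from the authors': two constituents with amplitudes a_1 and a_2, frequencies \omega_1 and \omega_2 = \omega_1 + \Delta\omega, and independent, uniformly distributed phases \phi_1 and \phi_2. The symbols C(\tau) and S(\tau), and the sign convention for S, are choices made here for illustration.

```latex
\begin{aligned}
x(t) &= a_1\cos(\omega_1 t+\phi_1) + a_2\cos(\omega_2 t+\phi_2),\\
R(\tau) &= \langle x(t)\,x(t+\tau)\rangle
         = \tfrac{a_1^2}{2}\cos(\omega_1\tau) + \tfrac{a_2^2}{2}\cos(\omega_2\tau)\\
        &= \underbrace{\left[\tfrac{a_1^2}{2} + \tfrac{a_2^2}{2}\cos(\Delta\omega\,\tau)\right]}_{C(\tau)}\cos(\omega_1\tau)
         \;-\; \underbrace{\tfrac{a_2^2}{2}\sin(\Delta\omega\,\tau)}_{S(\tau)}\,\sin(\omega_1\tau).
\end{aligned}
```

The last line uses \cos(\omega_1\tau + \Delta\omega\,\tau) = \cos(\omega_1\tau)\cos(\Delta\omega\,\tau) - \sin(\omega_1\tau)\sin(\Delta\omega\,\tau). R(\tau) remains even because S(\tau) is odd in \tau, so a fit at the single carrier frequency \omega_1 can have a non-zero sine component without contradicting the symmetry of the autocovariance.
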
  [...] Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a non-robust estimator (Var A)?
  Ans.: The definition of the total error in equation (4) was not rigorous. Referee #2 also pointed out the wrong assumption of Gaussian statistics when computing the confidence interval of the complex demodulate. For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulate.
  
  l157: "not significantly different" -- Well, I agree that they do fall within each other's standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?
  Ans.: Modified the sentence: “Apart from the first couple of demodulates, the HYCOM demodulate series appears consistently smaller than the Argo one.” The number of d.o.f. is not straightforward to estimate, mainly because one first has to estimate how many independent values our 767-h records contain. Eq. (24) in Awe’s 1964 paper “Errors in correlation between time series” gives a way to compute such an estimate. However, since we do not know the underlying true autocovariance function, we cannot evaluate this precisely. Using the local mean autocovariance computed from Argo data for the local example shown in Fig. 2, we estimate that values separated by L~12 h may be considered independent of one another. Hence, when computing the mean autocovariance at a given \tau, we have roughly N_p*(N-\tau)/L d.o.f., with the number of 32-day records N_p=8 and the number of values in each record N=767. For small \tau, we have approximately 8*767/12 ~ 500 d.o.f. For the corresponding Lagrangian data from HYCOM we get ~250 d.o.f. (here L~39 h). Thus, the two estimates have more than enough d.o.f. to be considered different. The new confidence interval estimates reflect that conclusion.
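
The rule of thumb N_p*(N-\tau)/L quoted above can be evaluated directly. The short sketch below does so for the Argo numbers given in the reply; the function name `effective_dof` is hypothetical. The HYCOM figure is not recomputed, because the reply gives L~39 h but not the number of HYCOM records.

```python
def effective_dof(n_records, n_values, lag_hours, decorrelation_hours):
    """Rough d.o.f. count N_p * (N - tau) / L for a mean autocovariance at lag tau."""
    return n_records * (n_values - lag_hours) / decorrelation_hours

# Argo example from the reply: N_p = 8 records, N = 767 hourly values, L ~ 12 h.
argo_dof_small_lag = effective_dof(8, 767, 0, 12)
argo_dof_48h = effective_dof(8, 767, 48, 12)

print(round(argo_dof_small_lag))  # 511, i.e. the "~500" quoted above
print(round(argo_dof_48h))        # 479
```

The count shrinks only slowly with lag over the first few days, which is consistent with treating it as roughly 500 for small \tau.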
  
  l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?
  Ans.: Precisely. That way we are not introducing any discrepancy due to the (random) Lagrangian spatial sampling. Added explanations in the text.
  
  Fig 3: This is related to the above question: Why are the Eulerian error bars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?
  Ans.: The Eulerian mean autocovariance uses many more d.o.f. There are about 60 times more 32-day segments that are used to compute the Eulerian mean autocovariance resulting in about 4600 d.o.f. (L is also larger, i.e. the data less independent). Added explanations in the main text. Added in the caption of Fig. 3 that the Lagrangian HYCOM curves in Fig. 2 and 3 are identical.
  
  Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).
  Ans.: The color choice we made is to have the HYCOM Lagrangian data plotted in black, and the in situ data in red whenever possible. The red curve in Fig 3b should indeed be the same as the black curve in Fig 4b (Added text in the caption of Fig.4). There was an error in the plotting script.
  
  l190: "for each particle" -> "for each HYCOM particle" ? But if this paragraph applies to HYCOM, how can there be outliers?
  Ans.: Replaced "for each particle" by "for each HYCOM particle". \eta_1000 values can be unstable when facing small temperature gradients in the denominator of Eq. (1) and (5), leading to unrealistically large variance. For consistency with the Argo results, we use the same quality checks on the variance of \eta_1000 computed using HYCOM data. Added explanations in the text.
  
  l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified. Paragraph at line 195: This is a very good comparison and a little surprising, to me.
  Ans.: Yes, the HYCOM Eulerian autocovariances are computed all along the trajectory subsampled every 12 h. As written before, the HYCOM Eulerian autocovariance estimates do use many more degrees of freedom. As a result the formal error is much smaller.
  
  Fig 6: Why are the maps drawn so small?
  Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.
  Ans.: Enlarged Fig. 6, 7, 8, 9, 13, 14.
  
  l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?
  Ans.: The ratio itself is not plotted (added as a comment in the main text), however we write its mean value and standard deviation in the text.
  
  l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?
  Ans.: This criterion removes roughly 3000 Argo segments (~15%), but the final maps only get ~1% less coverage. This mostly affects the Argo data density in the regions east of Drake Passage, east of Agulhas, and in the Equatorial Pacific. Attached is a figure similar to Fig. 8 but based on the mean drift speed: the Argo mean autocovariance remains more or less the same while the Lagrangian HYCOM autocovariance is significantly smaller for drift speeds smaller than 0.033 m s-1. Coincidentally, the HYCOM bins with a mean speed smaller than 0.033 m s-1 are mostly poleward of 50 deg S.

  l227: Once again, "intrinsic" does not seem to be the right word.
  Ans.: Deleted "intrinsic".
  
  l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.
  Ans.: As pointed out by Referee #2, the complex demodulates are not Gaussian distributed. Following our definition of the complex demodulate, if the fitted C and S are Gaussian random variables with the same standard deviation, then A = sqrt(C^2+S^2) follows a Rice distribution. We did not give further thought to the expected distribution of the local mean autocovariance.
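
The statement about A = sqrt(C^2+S^2) can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions and not the authors' procedure: it takes C and S to be independent Gaussians with a common standard deviation `sigma`, and the function name `amplitude_interval` and the input values are hypothetical.

```python
import numpy as np

def amplitude_interval(c_fit, s_fit, sigma, n_draws=20000, level=0.95, seed=0):
    """Percentile interval for A = sqrt(C^2 + S^2) when C and S carry
    independent Gaussian errors with the same standard deviation sigma."""
    rng = np.random.default_rng(seed)
    c = c_fit + sigma * rng.standard_normal(n_draws)
    s = s_fit + sigma * rng.standard_normal(n_draws)
    amp = np.hypot(c, s)  # Rice-distributed under the assumptions above
    tail = 100 * (1 - level) / 2
    lower, upper = np.percentile(amp, [tail, 100 - tail])
    return amp, lower, upper

# Limiting case of zero true amplitude: the Rice distribution reduces to a
# Rayleigh distribution, whose mean is sigma * sqrt(pi / 2), not zero.
amp, lower, upper = amplitude_interval(c_fit=0.0, s_fit=0.0, sigma=1.0)
print(amp.mean(), lower, upper)
```

Because the amplitude is non-negative by construction, a percentile interval built this way cannot cross zero, unlike a symmetric Gaussian interval centred on a small amplitude.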
  
  l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.
  Ans.: Indeed the SCVF_15 statistic from local mean autocovariance was very noisy. We now only compute it from larger populations of sample autocovariances. Fig 9 is now showing the variance instead of SCVF_15.
  
  l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh -- I see it in Table 1.
  Ans.: -
  
  l262: I am not sure what the "discrepancy" refers to.
  Ans.: Poor writing, rewrote the sentence.
  
  l265: SCVF^{15} -> SCVF_{15}
  Ans.: Corrected
  
  l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.
  Ans.: Replaced “no impact” by “no significant impact”. We do not know what the true autocovariance function is or how it behaves close to zero time lag. The first demodulate is a conservative, simple, and robust estimate of the IT variance. It is also consistent within this work.
  
  l275: "mecanism" -> "mechanism"
  Ans.: Corrected
  
  l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.
  Ans.: Replaced "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
  
  l306: "sinusoide" -> "sinusoid"
  Ans.: Corrected
  
  Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.
  Ans.: The M2-S2 beating, although probably dominating the semidiurnal amplitude modulation, is definitely not the only contribution to this amplitude modulation (other constituents close to M2 play a role).
  
  l330-358: Modes discussion.
  Ans.: -
  
  l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.
  Ans.: We do not have quantitative information on this error spectrum.
  
  l386-401: Stratification. None of these discussions really deal with the overall quantitative difference of HYCOM vs obs which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.
  Ans.: Added summary sentence at the end of Sect. 4.
  
  l403: "run with" -> "run of"
  Ans.: Corrected
  
  l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?
  Ans.: This factor of 1.5 was already mentioned at l.225. Contrary to the 0.74 value at l.216, it does include latitudes north of 50 deg N. Rewrote with factors consistent throughout the section. Added table 1 to summarize the statistics for the different groups.
  
  l422: "stationnary" -> "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".
  Ans.: Corrected. Replaced “the IT becomes stationary” by “the IT autocovariance reaches its stationary limit”.
  
  l429: "unaffected" -> "unaffected in the mean"
  Ans.: Corrected
  
  l339: "supposedly account" -> "supposedly accounts"
  Ans.: Corrected
  
  l452: latex formatting needs help in the URL.
  Ans.: Corrected
  
  l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html
  References: inconsistent capitalization is used in article titles
  Ans.: Corrected
  
  l557: "and contributors, T. P."
  Ans.: Corrected
  
  Citation: https://doi.org/10.5194/egusphere-2022-1085-AC1
RC2: 'Comment on egusphere-2022-1085', Anonymous Referee #2, 07 Dec 2022

General comments:

This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.

I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest that the authors properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not take that approach from the beginning? I also suggest replacing the term "demodulate" by something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
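
The analytic-transform suggestion can be sketched as follows. This is an illustrative example only and does not reproduce the manuscript's demodulation: the synthetic autocovariance (an exponentially decaying cosine with a 12.42 h period and a 200 h decay scale) and the function name `analytic_envelope` are assumptions. The function builds the analytic signal with a standard FFT construction, which `scipy.signal.hilbert` also provides.

```python
import numpy as np

def analytic_envelope(r):
    """Envelope |r + i*H(r)| of a real series via the FFT-based analytic signal."""
    n = len(r)
    spectrum = np.fft.fft(r)
    # Keep the zero (and Nyquist) bins, double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

# Hypothetical autocovariance: hourly lags over 768 h, unit variance at zero lag.
lags = np.arange(768)
decay = np.exp(-lags / 200.0)
autocov = decay * np.cos(2 * np.pi * lags / 12.42)

envelope = analytic_envelope(autocov)
interior = slice(150, 600)  # the ends are affected by the implied periodicity
max_error = np.max(np.abs(envelope[interior] - decay[interior]))
print(max_error)
```

The envelope is non-negative by construction. Its values near both ends of the lag range are less reliable, because the FFT treats the series as periodic, so the shortest and longest lags would need separate care in practice.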

For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:

- comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?

- comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?

- Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?

Specific comments and technical corrections:

Abstract: line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "auto-correlation" or "auto-covariance" which are established statistical terms. An abstract should be able to stand alone.

It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.

l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?

l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?

Section 2.3: Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?

l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?

l101: Why 41644? Does this correspond to a mean geographical density?

l106: The effects of the drift? Do you mean potential Lagrangian biases?

Section 3.1; eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?

l113-115: which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.

Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?

l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?

l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?

Eq4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
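
The point about non-Gaussian amplitudes can be illustrated with a small simulation. The sketch below assumes the pure-noise case, in which the cosine and sine coefficients are independent zero-mean Gaussians, so that A^2 scaled by the noise variance follows a chi-square distribution with 2 degrees of freedom. The function names, the sample size, and the noise level are illustrative assumptions and are not taken from the manuscript.

```python
import math
import random
import statistics

def amplitude_samples(n_samples, sigma=1.0, seed=0):
    """Draw amplitudes A = sqrt(a^2 + b^2), where a and b are independent
    zero-mean Gaussian cosine and sine coefficients (pure-noise assumption)."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n_samples)]

def gaussian_interval(samples):
    """Mean plus or minus two standard deviations (implicitly assumes normality)."""
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    return mean - 2.0 * std, mean + 2.0 * std

def percentile_interval(samples, lower=0.025, upper=0.975):
    """Empirical quantile interval; makes no distributional assumption."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]

samples = amplitude_samples(100_000)
gauss_lo, gauss_hi = gaussian_interval(samples)
pct_lo, pct_hi = percentile_interval(samples)
print(f"mean +/- 2 std : [{gauss_lo:.3f}, {gauss_hi:.3f}]")
print(f"2.5-97.5 pct   : [{pct_lo:.3f}, {pct_hi:.3f}]")
```

Under these assumptions the lower bound of the two-standard-deviation interval falls slightly below zero, whereas the empirical 2.5% quantile stays positive, as any bound on an amplitude must.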

Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.

l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.

I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance, see Lilly and Gascard (2006) as an example (The analytic signal can be calculated using the Hilbert transform in python with the scipy package or the anatrans.m function of the jLab toolbox for Matlab). One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
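
A minimal sketch of the envelope computation suggested here, assuming a synthetic, even autocovariance that contains a single semidiurnal component. The analytic signal is built directly with an FFT, which is the same construction that `scipy.signal.hilbert` uses; the helper name `analytic_envelope`, the Gaussian-shaped envelope, and all numerical values are illustrative assumptions.

```python
import numpy as np

def analytic_envelope(x):
    """Amplitude of the analytic signal of a real 1-D series, built with an
    FFT: keep the zero frequency, double the positive frequencies, and
    zero the negative ones."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist term for even lengths
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

# Synthetic, even "autocovariance": a semidiurnal cosine (period 12.42 h)
# under a smooth decaying envelope. All numbers here are illustrative.
lags = np.arange(-384.0, 385.0)            # hours, 1 h sampling
omega = 2.0 * np.pi / 12.42                # rad per hour
true_envelope = 10.0 * np.exp(-(lags / 150.0) ** 2)
autocov = true_envelope * np.cos(omega * lags)

envelope = analytic_envelope(autocov)
max_error = float(np.max(np.abs(envelope - true_envelope)))
print(f"largest envelope error: {max_error:.4f}")
```

The example works because the input holds one narrow frequency band. If the series also contained energy at other frequencies, the amplitude of the analytic signal would reflect all of them together.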

l190: "outliers": please use a sentence to explain what you mean.

l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.

l198: "taken as ..." : state this earlier to remind the reader.

l208: a bias which means that HYCOM underestimates Argo, correct?

Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
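
The two statistics can be compared on a few hypothetical variance pairs. In the sketch below, `x` and `y` stand for the two variance estimates being compared; which dataset plays which role is not specified here, and the numbers are made up for illustration.

```python
import math

def log_ratio(x, y):
    """log10(x / y): unbounded, equal to 0 when x == y."""
    return math.log10(x / y)

def bounded_ratio(x, y):
    """x / (x + y): confined to (0, 1) for positive inputs, 0.5 when x == y."""
    return x / (x + y)

# Hypothetical variance pairs (x, y), including two extreme outliers.
pairs = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (1.0e4, 1.0), (1.0, 1.0e4)]
for x, y in pairs:
    print(f"x={x:g}, y={y:g}: log10(x/y)={log_ratio(x, y):+.2f}, "
          f"x/(x+y)={bounded_ratio(x, y):.4f}")
```

The two quantities are related by x/(x+y) = 1/(1 + 10^(-log10(x/y))), so they rank the pairs identically, but the bounded ratio cannot run off the color scale however extreme an outlier is.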

l214: The ratio increases approaching the poles? Where is this seen?

l225: Should you conclude the section with some statement?

Section 4.2:

l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?

l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).

l235: yes indeed because the autocovariance and its amplitude are probably not Gaussian distributed!

l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.

l253: "log domain" : this figure appears to be on a linear scale?

l270: truely -> truly

Figure 11b: a legend for the various fitted curves would be very helpful.

l293: Why is it 3 times T_{int}?

l306: "Note that ..." : this should be moved earlier just after your eq 6.

l 315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.

Section 4.4:

l323: If your method holds, should you not rather say that the model is biased low?

Data availability: A statement on the HYCOM data availability is missing.

Citation: https://doi.org/10.5194/egusphere-2022-1085-RC2
- AC2: 'Reply on RC2', Gaspard Geoffroy, 24 Jan 2023
  
  We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, helped improve the manuscript.
  General comments:
  This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.
  I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest that the authors properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not take that approach from the beginning?
  
  Ans.: We wrongly assumed that the complex demodulates are Gaussian distributed; hence our confidence intervals were incorrect. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
  
  I also suggest replacing the term "demodulate" with something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
  
  Ans.: From our understanding, the analytic transform would not capture the amplitude at the M2 frequency unless the \eta_1000 time series is band-pass filtered beforehand.
  
  For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:
  - comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?
  - comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?
  - Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?
  
  Ans.: Section 3 in itself is not about validating HYCOM, but about introducing, using a local example, the methods used to further validate HYCOM (section 4). There are other papers describing strict Eulerian point-to-point comparisons between HYCOM and moorings (cf. the cited papers Ansong, 2017 and Luecke, 2020). Rather, the Eulerian component of our analysis is designed to bolster and extend the main Lagrangian component (added clarification at the end of section 1). Therefore, the logical beginning of section 3 is the methodology developed by Geoffroy and Nycander (2022) to estimate the variance of the semidiurnal IT using Lagrangian data. We then developed an Eulerian framework using the HYCOM data primarily to validate our Lagrangian methodology. Incidentally, it also enables the analysis of the decorrelation of the IT. This mirrors the organization of section 4.
  
  Comments: Abstract:
  line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "auto-correlation" or "auto-covariance" which are established statistical terms. An abstract should be able to stand alone. It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.
  Ans.: Reworked abstract. Suppressed “decorrelate”.
  
  l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?
  Ans.: Yes. Rephrased.
  
  l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?
  Ans.: Corrected
  
  Section 2.3:
  Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?
  Ans.: The reference for HYCOM (Chassignet et al., 2006) was given in the introduction, the first time we used the HYCOM acronym. We do not have any other reference for this particular simulation (apart from 'GLBy190.04'). Added information on the Lagrangian simulation.
  
  l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?
  Ans.: Suppressed “mainly”. Added paragraph at the beginning of the section regarding the data we use. The model wasn’t run specifically for this study; we used the data that were available and suitable for our methodology. Hence, there are no real technical constraints to mention.
  
  l101: Why 41644? Does this correspond to a mean geographical density?
  Ans.: Yes, it roughly corresponds to a mean density of 15 particles in our final 200 km radius circular patches. We feel it is unnecessary to add this; moreover, it would not be easy to motivate clearly at this stage of the paper.
  
  l106: The effects of the drift? Do you mean potential Lagrangian biases?
  Ans.: As pointed out by referee #1, we prefer to call these effects “Lagrangian decorrelation”.
  
  Section 3.1:
  eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?
  Ans.: Added explanations. We do correct for the float displacement for the Argo data.
  
  l113-115: which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.
  Ans.: Added information. It is the modeled monthly-mean 3D temperature field introduced in section 2.3. For the Argo data, we compute the temperature gradient at 1000 dbar for a given park phase using the temperature profiles recorded by the float immediately before and after that park phase (now made clearer in section 3).
  
  Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?
  Ans.: Figure 1 shows the median position of the Argo segments as dots (now made clear in the text). Added Argo trajectories.
  
  l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?
  Ans.: Rephrased.
  
  l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?
  Ans.: Here R_{argo} is the red curve. We do see it fall below its CI at ~200h.
  
  Eq4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
  Ans.: Correct. Referee #1 also pointed at the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
  
  Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.
  Ans.: Figure modified.
  
  l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.
  Ans.: Also discussed by referee #1, modified the sentence.
  
  I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance, see Lilly and Gascard (2006) as an example (The analytic signal can be calculated using the Hilbert transform in python with the scipy package or the anatrans.m function of the jLab toolbox for Matlab). One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
  Ans.: Our complex demodulate method specifically selects the amplitude at the semidiurnal frequency. Our understanding is that unless the time series is band-pass filtered prior to computing the autocovariance, the analytic transform cannot be used to isolate the amplitude of the oscillations at the semidiurnal frequency.
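
To make concrete what selecting the amplitude at a single frequency can look like, here is a minimal sketch that fits a cosine and a sine at the semidiurnal frequency in a sliding window over the lags. This is one common way to demodulate at a fixed frequency and is an assumption for illustration; it does not reproduce the exact procedure of the manuscript. The function name `demodulate`, the window length, and the synthetic input are all hypothetical.

```python
import numpy as np

def demodulate(lags, autocov, omega, half_width):
    """Sliding-window least-squares fit of a*cos(omega*tau) + b*sin(omega*tau).

    Returns the window-centre lags and the fitted amplitude sqrt(a^2 + b^2).
    `half_width` is the number of samples on each side of the centre.
    """
    lags = np.asarray(lags, dtype=float)
    autocov = np.asarray(autocov, dtype=float)
    centres, amplitudes = [], []
    for i in range(half_width, lags.size - half_width):
        window = slice(i - half_width, i + half_width + 1)
        tau = lags[window]
        design = np.column_stack([np.cos(omega * tau), np.sin(omega * tau)])
        (a, b), *_ = np.linalg.lstsq(design, autocov[window], rcond=None)
        centres.append(lags[i])
        amplitudes.append(np.hypot(a, b))
    return np.array(centres), np.array(amplitudes)

# Illustrative input: semidiurnal (12.42 h) plus diurnal (23.93 h) signals.
lags = np.arange(0.0, 768.0)                     # hours, 1 h sampling
omega_sd = 2.0 * np.pi / 12.42                   # rad per hour
autocov = (10.0 * np.cos(omega_sd * lags)
           + 5.0 * np.cos(2.0 * np.pi / 23.93 * lags))
centres, amplitude = demodulate(lags, autocov, omega_sd, half_width=36)
print(f"fitted semidiurnal amplitude: {amplitude.min():.2f} to {amplitude.max():.2f}")
```

Because the fit targets a single frequency, energy at other frequencies only enters through leakage over the finite window, which is the property the reply relies on.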
  
  l190: "outliers": please use a sentence to explain what you mean.
  Ans.: Also pointed out by referee #1. \eta_1000 values can be unstable when the temperature gradient in the denominator of Eq. (1) and (5) is small, leading to unrealistically large variance. For consistency with the Argo results, we apply the same quality checks to the variance of \eta_1000 computed using HYCOM data. Added sentence.
  
  l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.
  Ans.: Deleted “scatter”. There was an error in the script calculating R^2. The correct value in log domain is 0.98, in non-log domain it is 0.74. Our R^2 is (Pearson’s R)^2. Changed to r^2 to avoid any confusion with the autocovariance (denoted R).
  
  l198: "taken as ..." : state this earlier to remind the reader.
  Ans.: Added statement earlier in the text.
  
  l208: a bias which means that HYCOM underestimates Argo, correct?
  Ans.: Correct. Rephrased.
  
  Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
  Ans.: Replaced log10(x/y) by x/(x+y).
  
  l214: The ratio increases approaching the poles? Where is this seen?
  Ans.: The ratio itself is not shown, added comment.
  
  l225: Should you conclude the section with some statement?
  Ans.: Added sentence referring to the discussion on potential sources of biases.
  
  Section 4.2:
  l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?
  Ans.: Also noted by referee #1. Everywhere replaced "intrinsic decorrelation" by "decorrelation of the IT", or "decorrelation", and further "apparent decorrelation" by "Lagrangian decorrelation".
  
  l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).
  Ans.: Changed the proposition to “We compute a sample autocovariance in the Eulerian framework for each particle”. This refers to the sample autocovariance computed in our Eulerian framework as explained in section 3.3. The sample autocovariance in the Eulerian framework is computed along the particle’s trajectory using Eulerian data.
  
  l235: yes indeed because the autocovariance and its amplitude are probably not Gaussian distributed!
  Ans.: Agreed, by definition our demodulates are Rice distributed. Deleted “(and their demodulates)”. However, we emphasize that the sample mean autocovariance at a given time lag can be considered Gaussian distributed in the two following cases:
  - In a local geographical patch: we assume the particles are randomly sampling a wave field with uniform statistics.
  - When computing the sample mean autocovariance from a very large population of particles (global or regional scales), by virtue of the central limit theorem.
  
  l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.
  Ans.: As noted by referee #1, as a ratio of sample statistics, SCVF_15 is expected to be noisy. We now plot the IT variance instead.
  
  l253: "log domain" : this figure appears to be on a linear scale?
  Ans.: Plot changed.
  
  l270: truely -> truly
  Ans.: Corrected
  
  Figure 11b: a legend for the various fitted curves would be very helpful.
  Ans.: Added legend
  
  l293: Why is it 3 times T_{int}?
  Ans.: Because 95% of the exponential decay is achieved within 3 time constants (exp(-3)~0.05). Added comment.
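
  The arithmetic behind the factor of 3 can be written out for a generic exponential envelope. The symbol A(\tau) and the pure exponential form below are assumptions made for illustration; only the time constant T_{int} comes from the comment above.

  ```latex
  A(\tau) = A(0)\, e^{-\tau / T_{int}}
  \quad\Longrightarrow\quad
  \frac{A(3\,T_{int})}{A(0)} = e^{-3} \approx 0.0498 ,
  \qquad
  1 - e^{-3} \approx 0.95 .
  ```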
  
  l306: "Note that ..." : this should be moved earlier just after your eq 6.
  Ans.: This remark explains the results of the fitting. We did not assume this when defining our model.
  
  l315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.
  Ans.: Expanded on the implications for HYCOM.
  
  Section 4.4:
  l323: If your method holds, should you not rather say that the model is biased low?
  Ans.: Since we do not see any reasons for the in situ data/processing to be biased high, indeed our conclusion is that HYCOM is biased low.
  
  Data availability: A statement on the HYCOM data availability is missing.
  Ans.: Added statement
  
  Citation: https://doi.org/10.5194/egusphere-2022-1085-AC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2022-1085', Anonymous Referee #1, 21 Nov 2022

The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the time-lagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32-day-long HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at near-zero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used Caspar-Cohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.

They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing small-scale mesoscale or submesoscale variability in HYCOM or the deficit of high-frequency wind forcing.

Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.

Comments:

l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "non-total" quantity to understand what distinction is implied by "total".

l11: "beams" -> "waves" or "beams of waves"

l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".

l19: "causes" -> "cause"

l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest re-thinking the purpose of this paragraph and re-writing it to more clearly articulate the point you wish to make.

l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.

l52: Finally, "the autocovariance at short time lags", is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and high-frequency ocean variability when looking at the "short time lags"?

l52: "On the other hand" -- I am not clear what is the other part of the contrast. Omit this phrase?

l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But, the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.

l63: omit "the strength of"

l73: "can vary" by how much?

l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?

l111: Is dT/dP in the numerator the same as dTbar/dP?

l114: "obtained" -> "estimated"

l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?

l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right? Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a non-robust estimator (Var A)?

l157: "not significantly different" -- Well, I agree that they do fall within each other's standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?

l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?

Fig 3: This is related to the above question: Why are the Eulerian error bars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?

Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).

l190: "for each particle" -> "for each HYCOM particle" ? But if this paragraph applies to HYCOM, how can there be outliers?

l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified.

Paragraph at line 195: This is a very good comparison and a little surprising, to me.

Fig 6: Why are the maps drawn so small?

Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.

l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?

l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?

l227: Once again, "intrinsic" does not seem to be the right word.

l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.

l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.

l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh -- I see it in Table 1.

l262: I am not sure what the "discrepancy" refers to.

l265: SCVF^{15} -> SCVF_{15}

l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.

l275: "mecanism" -> "mechanism"

l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.

l306: "sinusoide" -> "sinusoid"

Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.

l330-358: Modes discussion.

l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.

l386-401: Stratification.

None of these discussions really deal with the overall quantitative difference of HYCOM vs obs which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.

l403: "run with" -> "run of"

l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?

l422: "stationnary" -> "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".

l429: "unaffected" -> "unaffected in the mean"

l339: "supposedly account" -> "supposedly accounts"

l452: latex formatting needs help in the URL.

l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html

References: inconsistent capitalization is used in article titles

l557: "and contributors, T. P."

Citation: https://doi.org/10.5194/egusphere-2022-1085-RC1
- AC1: 'Reply on RC1', Gaspard Geoffroy, 24 Jan 2023
  
  We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, helped to improve the manuscript.
  General comments:
  The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the time-lagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32-day-long HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at near-zero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used Caspar-Cohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.
  They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing small-scale mesoscale or submesoscale variability in HYCOM or the deficit of high-frequency wind forcing.
  Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.
  
  Comments:
  l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "non-total" quantity to understand what distinction is implied by "total".
  Ans.: Reworked abstract. Suppressed “total”.
  
  l11: "beams" -> "waves" or "beams of waves"
  Ans.: Corrected
  
  l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".
  Ans.: The sentence is correct: the phase difference accounts for the propagation of the waves from the generation site to any given position, but it is constant in time.
  
  l19: "causes" -> "cause"
  Ans.: Corrected
  
  l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest re-thinking the purpose of this paragraph and re-writing it to more clearly articulate the point you wish to make.
  Ans.: Reworked paragraph, re-centering it on the stationary/nonstationary wave field.
  
  l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.
  Ans.: “total” was defined earlier in the text (l.25). Added italic font.
  
  l52: Finally, "the autocovariance at short time lags", is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and high-frequency ocean variability when looking at the "short time lags"?
  Ans.: “total” was defined earlier in the text (l.25). Provided short explanation on how noise is filtered out.
  
  l52: "On the other hand" -- I am not clear what is the other part of the contrast. Omit this phrase?
  Ans.: Contrasts the Lagrangian decorrelation downside with the fine time resolution upside.
  
  l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But, the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.
  Ans.: Agreed. Replaced "intrinsic decorrelation" by "decorrelation" or "decorrelation of the IT", and further "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
  
  l63: omit "the strength of"
  Ans.: Corrected
  
  l73: "can vary" by how much?
  Ans.: Replaced by “The sampling period of the park phase can occasionally vary by more than a few seconds.” The vast majority of the park phases we use have a sampling period of 1h. Very rarely do park phases have sampling periods significantly shorter (and even more rarely longer) than one hour.
  
  l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?
  Ans.: Added information at the beginning of section 2, and in sections 2 and 3 (isotherm displacement is properly defined in section 3).
  
  l111: Is dT/dP in the numerator the same as dTbar/dP?
  Ans.: No: Tbar in the numerator is not a function of z, and they are calculated independently.
  
  l114: "obtained" -> "estimated"
  Ans.: Corrected
  
  l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?
  Ans.: This estimator is unbiased: for increasingly large N, the estimator converges to the true value.
  
  l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right? [...]
  Ans.: Indeed, the autocovariance is an even function. However, when two tidal constituents close in frequency are present, the autocovariance of the resulting beating can be expressed in terms of a cosine and a sine component (short derivation attached).
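
The attached derivation is not reproduced on this page. As a minimal sketch of the argument, assume two constituents with constant amplitudes a_1, a_2, fixed phases \phi_1, \phi_2, and frequencies \omega_1 and \omega_2 = \omega_1 + \Delta\omega (all symbols are introduced here for illustration and are not taken from the manuscript). Averaging over a record much longer than the beating period removes the cross terms, and the angle-addition formula then splits the second constituent into a cosine and a sine at the carrier frequency \omega_1:

```latex
\begin{aligned}
x(t) &= a_1\cos(\omega_1 t+\phi_1) + a_2\cos(\omega_2 t+\phi_2),
\qquad \omega_2 = \omega_1 + \Delta\omega, \\
R(\tau) &= \tfrac{1}{2}a_1^2\cos(\omega_1\tau) + \tfrac{1}{2}a_2^2\cos(\omega_2\tau) \\
        &= \Big[\tfrac{1}{2}a_1^2 + \tfrac{1}{2}a_2^2\cos(\Delta\omega\,\tau)\Big]\cos(\omega_1\tau)
         - \Big[\tfrac{1}{2}a_2^2\sin(\Delta\omega\,\tau)\Big]\sin(\omega_1\tau).
\end{aligned}
```

R(\tau) remains even in \tau because the coefficient of the sine term is itself odd in \tau. Over a lag window that is short compared with 2\pi/\Delta\omega, however, that coefficient is nearly constant and generally nonzero, so a local fit at \omega_1 picks up a genuine sine component.
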
  […] Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a non-robust estimator (Var A)?
  Ans.: The definition of the total error in equation (4) was not rigorous. Referee #2 also pointed at the wrong assumption of Gaussian statistics when computing the confidence interval of the complex demodulate. For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulate.
  
  l157: "not significantly different" -- Well, I agree that they do fall within each others' standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?
  Ans.: Modified the sentence: “Apart from the first couple of demodulates, the HYCOM demodulate series appears consistently smaller than the Argo one.” The number of d.o.f. is not straightforward to estimate, mainly because one first has to estimate how many independent values our 767-h records contain. Eq. (24) in Awe’s 1964 paper “Errors in correlation between time series” gives a way to compute such an estimate. However, since we do not know the underlying true autocovariance function, we cannot evaluate it precisely. Using the local mean autocovariance computed from Argo data for the local example shown in Fig. 2, we estimate that values separated by L~12 h may be considered independent of one another. Hence, when computing the mean autocovariance at a given \tau, we have roughly N_p*(N-\tau)/L d.o.f., with the number of 32-day records N_p=8 and the number of values in each record N=767. For small \tau, we have approximately 8*767/12 ~ 500 d.o.f. For the corresponding Lagrangian data from HYCOM we get ~250 d.o.f. (here L~39 h). Thus, both estimates have more than enough d.o.f. for the two series to be considered different. The new confidence interval estimates reflect that conclusion.
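
The Argo estimate quoted in this answer can be reproduced with a few lines of arithmetic. The sketch below only encodes the rule of thumb N_p*(N-\tau)/L stated in the reply; the function name is hypothetical, and hourly sampling is assumed so that a lag in hours equals the number of values lost at that lag.

```python
def effective_dof(n_records, n_values, lag_h, decorrelation_h):
    """Rough d.o.f. of the mean autocovariance at one lag: N_p * (N - tau) / L.

    Assumes hourly sampling (an assumption of this sketch), so that a lag of
    lag_h hours removes lag_h values from each record.
    """
    return n_records * (n_values - lag_h) / decorrelation_h


# Values quoted in the reply for the Argo example: N_p = 8, N = 767, L ~ 12 h.
argo_dof = effective_dof(n_records=8, n_values=767, lag_h=0, decorrelation_h=12)
print(round(argo_dof))  # about 500, as stated in the reply

# The count shrinks slowly as the lag grows.
argo_dof_48h = effective_dof(n_records=8, n_values=767, lag_h=48, decorrelation_h=12)
print(round(argo_dof_48h))
```

The HYCOM figure is not recomputed here because the number of HYCOM records in the example is not given in the reply.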
  
  l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?
  Ans.: Precisely. That way we are not introducing any discrepancy due to the (random) Lagrangian spatial sampling. Added explanations in the text.
  
  Fig 3: This is related to the above question: Why are the Eulerian errorbars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?
  Ans.: The Eulerian mean autocovariance uses many more d.o.f. There are about 60 times more 32-day segments that are used to compute the Eulerian mean autocovariance resulting in about 4600 d.o.f. (L is also larger, i.e. the data less independent). Added explanations in the main text. Added in the caption of Fig. 3 that the Lagrangian HYCOM curves in Fig. 2 and 3 are identical.
  
  Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).
  Ans.: The color choice we made is to have the HYCOM Lagrangian data plotted in black, and the in situ data in red whenever possible. The red curve in Fig. 3b should indeed be the same as the black curve in Fig. 4b (added text in the caption of Fig. 4). There was an error in the plotting script.
  
  l190: "for each particle" -> "for each HYCOM particle" ? But if this paragraph applies to HYCOM, how can there be outliers?
  Ans.: Replaced "for each particle" by "for each HYCOM particle". \eta_1000 values can be unstable when facing small temperature gradients in the denominator of Eq. (1) and (5), leading to unrealistically large variance. For consistency with the Argo results, we use the same quality checks on the variance of \eta_1000 computed using HYCOM data. Added explanations in the text.
  
  l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified. Paragraph at line 195: This is a very good comparison and a little surprising, to me.
  Ans.: Yes, the HYCOM Eulerian autocovariances are computed all along the trajectory subsampled every 12 h. As written before, the HYCOM Eulerian autocovariance estimates do use many more degrees of freedom. As a result the formal error is much smaller.
  
  Fig 6: Why are the maps drawn so small?
  Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.
  Ans.: Enlarged Fig. 6, 7, 8, 9, 13, 14.
  
  l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?
  Ans.: The ratio itself is not plotted (added as a comment in the main text), however we write its mean value and standard deviation in the text.
  
  l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?
  Ans.: This criterion removes roughly 3000 Argo segments (~15%), but the final maps only lose ~1% of their coverage. This mostly affects the Argo data density in the regions east of Drake Passage, east of Agulhas, and in the Equatorial Pacific. Attached is a figure similar to Fig. 8 but based on the mean drift speed: the Argo mean autocovariance remains more or less the same, while the Lagrangian HYCOM autocovariance is significantly smaller for drift speeds smaller than 0.033 m s^{-1}. Coincidentally, the HYCOM bins with a mean speed smaller than 0.033 m s^{-1} are mostly poleward of 50 deg S.
  
  l227: Once again, "intrinsic" does not seem to be the right word.
  Ans.: Deleted "intrinsic".
  
  l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.
  Ans.: As pointed out by referee #2, the complex demodulates are not Gaussian distributed. Following our definition of the complex demodulate, if the fitted C and S are Gaussian random variables with the same standard deviation, then A = sqrt(C^2+S^2) follows a Rice distribution. We did not give further thought to the expected distribution of the local mean autocovariance.
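
The Rice statement can be checked numerically. The sketch below draws hypothetical Gaussian C and S with a common standard deviation (the means, the standard deviation and the sample size are illustrative values, not taken from the manuscript) and compares the sample second moment of A with the exact Rice value \nu^2 + 2\sigma^2, where \nu is the length of the mean vector. It checks one property only and is not a full distributional test.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical means and common standard deviation of the fitted C and S.
nu_c, nu_s, sigma = 3.0, 4.0, 1.0
n_draws = 200_000

c = rng.normal(nu_c, sigma, n_draws)
s = rng.normal(nu_s, sigma, n_draws)
a = np.hypot(c, s)  # A = sqrt(C^2 + S^2)

nu = np.hypot(nu_c, nu_s)                    # Rice non-centrality parameter
second_moment_theory = nu**2 + 2 * sigma**2  # exact E[A^2] for a Rice variable
second_moment_sample = np.mean(a**2)

print(second_moment_sample, second_moment_theory)
print(a.min() >= 0.0)  # the amplitude can never be negative
```

Because A is non-negative and skewed when \nu is small compared with \sigma, a symmetric interval of plus or minus two standard errors is a poor description of its uncertainty near zero.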
  
  l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.
  Ans.: Indeed the SCVF_15 statistic from local mean autocovariance was very noisy. We now only compute it from larger populations of sample autocovariances. Fig 9 is now showing the variance instead of SCVF_15.
  
  l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh -- I see it in Table 1.
  Ans.: -
  
  l262: I am not sure what the "discrepancy" refers to.
  Ans.: Poor writing, rewrote the sentence.
  
  l265: SCVF^{15} -> SCVF_{15}
  Ans.: Corrected
  
  l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.
  Ans.: Replaced “no impact” by “no significant impact”. We do not know what the true autocovariance function is, nor how it behaves close to zero time lag. The first demodulate is a conservative, simple, and robust estimate of the IT variance. It is also used consistently throughout this work.
  
  l275: "mecanism" -> "mechanism"
  Ans.: Corrected
  
  l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.
  Ans.: Replaced "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
  
  l306: "sinusoide" -> "sinusoid"
  Ans.: Corrected
  
  Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.
  Ans.: The M2-S2 beating, although probably dominating the semidiurnal amplitude modulation, is definitely not the only contribution to this amplitude modulation (other constituents close to M2 play a role).
  
  l330-358: Modes discussion.
  Ans.: -
  
  l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.
  Ans.: We do not have quantitative information on this error spectrum.
  
  l386-401: Stratification. None of these discussions really deal with the overall quantitative difference of HYCOM vs obs which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.
  Ans.: Added summary sentence at the end of Sect. 4.
  
  l403: "run with" -> "run of"
  Ans.: Corrected
  
  l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?
  Ans.: This factor of 1.5 was already mentioned at l.225. Contrary to the 0.74 value at l.216, it does include latitudes north of 50 deg N. Rewrote with factors consistent throughout the section. Added Table 1 to summarize the statistics for the different groups.
  
  l422: "stationnary" -> "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".
  Ans.: Corrected. Replaced “the IT becomes stationary” by “the IT autocovariance reaches its stationary limit”.
  
  l429: "unaffected" -> "unaffected in the mean"
  Ans.: Corrected
  
  l339: "supposedly account" -> "supposedly accounts"
  Ans.: Corrected
  
  l452: latex formatting needs help in the URL.
  Ans.: Corrected
  
  l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html
  References: inconsistent capitalization is used in article titles
  Ans.: Corrected
  
  l557: "and contributors, T. P."
  Ans.: Corrected
  
  Citation: https://doi.org/10.5194/egusphere-2022-1085-AC1
RC2:
'Comment on egusphere-2022-1085', Anonymous Referee #2, 07 Dec 2022

General comments:

This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.

I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest that the authors properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not take that approach from the beginning? I also suggest replacing the term "demodulate" by something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.

For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:

- comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?

- comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?

- Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?

Specific comments and technical corrections:

Abstract: line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "auto-correlation" or "auto-covariance" which are established statistical terms. An abstract should be able to stand alone.

It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.

l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?

l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?

Section 2.3: Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?

l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?

l101: Why 41644? Does this correspond to a mean geographical density?

l106: The effects of the drift? Do you mean potential Lagrangian biases?

Section 3.1; eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?

l113-115: which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.

Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?

l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?

l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?

Eq4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.

Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.

l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.

I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance, see Lilly and Gascard (2006) as an example (The analytic signal can be calculated using the Hilbert transform in python with the scipy package or the anatrans.m function of the jLab toolbox for Matlab). One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
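
As an illustration of the envelope suggestion above, the sketch below builds the analytic signal with a plain FFT, which is the same construction that `scipy.signal.hilbert` uses, and applies it to a synthetic autocovariance. The M2 period of 12.42 h, the exponential envelope and its 300 h decay scale are assumptions made for the example and do not come from the manuscript. Note that the envelope of an unfiltered series contains every frequency present, not only M2.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real 1-D series, built by zeroing negative frequencies."""
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


# Synthetic autocovariance: M2 carrier under a slowly decaying envelope.
m2_period_h = 12.42
lags_h = np.arange(0, 745)                       # hourly lags
true_envelope = 10.0 * np.exp(-lags_h / 300.0)   # hypothetical decay scale
acov = true_envelope * np.cos(2 * np.pi * lags_h / m2_period_h)

envelope = np.abs(analytic_signal(acov))

# The FFT treats the series as periodic, so the first and last lags are
# distorted; compare only the interior.
interior = slice(100, 600)
print(np.max(np.abs(envelope[interior] - true_envelope[interior])))
```

Applied to each individual autocovariance, the same function would give the per-lag distribution of amplitudes from which quantiles can be read.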

l190: "outliers": please use sentence to explain what you mean.

l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.

l198: "taken as ..." : state this earlier to remind the reader.

l208: a bias which means that HYCOM underestimates Argo, correct?

Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
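
A short sketch of the two statistics compared in this comment, using made-up variances (which dataset plays the role of x and which of y is an assumption of the example). The bounded ratio x/(x+y) equals 0.5 where the two agree and stays within [0, 1] even when one value is close to zero, whereas log10(x/y) is unbounded. The two are related by x/(x+y) = 1/(1 + 10^{-log10(x/y)}).

```python
import numpy as np


def bounded_ratio(x, y):
    """x / (x + y) for non-negative x and y; 0.5 means the two agree."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return x / (x + y)


# Hypothetical variances (m^2), including two extreme pairs.
hycom = np.array([1.0, 5.0, 0.01, 20.0])
argo = np.array([1.0, 10.0, 10.0, 0.02])

ratio = bounded_ratio(hycom, argo)
log_ratio = np.log10(hycom / argo)

print(ratio)      # stays between 0 and 1
print(log_ratio)  # grows without bound for the extreme pairs
```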

l214: The ratio increases approaching the poles? Where is this seen?

l225: Should you conclude the section with some statement?

Section 4.2:

l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?

l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).

l235: yes indeed because the autocovariance and its amplitude are probably not gaussian distributed!

l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.

l253: "log domain" : this figure appears to be on a linear scale?

l270: truely -> truly

Figure 11b: a legend for the various fitted curves would be very helpful.

l293: Why is it 3 times T_{int}?

l306: "Note that ..." : this should be moved earlier just after your eq 6.

l 315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.

Section 4.4:

l323: If your method holds, should you not rather say that the model is biased low?

Data availability: A statement on the HYCOM data availability is missing.

Citation: https://doi.org/10.5194/egusphere-2022-1085-RC2
- AC2: 'Reply on RC2', Gaspard Geoffroy, 24 Jan 2023
  
  We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, helped to improve the manuscript.
  General comments:
  This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.
  I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest that the authors properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not take that approach from the beginning?
  
  Ans.: We wrongly assumed Gaussian distributed complex demodulates, hence our confidence intervals were incorrect. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
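
The reply does not describe the Monte Carlo procedure. The sketch below shows one possible resampling scheme, a bootstrap over individual sample autocovariances, and should not be read as the authors' method. The function names, the 24 to 72 h lag window, the M2 period of 12.42 h and the synthetic data are all assumptions made for the example.

```python
import numpy as np

M2_OMEGA = 2 * np.pi / 12.42  # rad per hour, assumed M2 period of 12.42 h


def demodulate_amplitude(acov, lags_h):
    """Least-squares fit of C*cos + S*sin at M2; returns A = sqrt(C^2 + S^2)."""
    design = np.column_stack([np.cos(M2_OMEGA * lags_h), np.sin(M2_OMEGA * lags_h)])
    (c, s), *_ = np.linalg.lstsq(design, acov, rcond=None)
    return np.hypot(c, s)


def bootstrap_ci(sample_acovs, lags_h, n_draws=500, seed=0):
    """Percentile interval of A from resampling records with replacement."""
    rng = np.random.default_rng(seed)
    n_records = sample_acovs.shape[0]
    amplitudes = np.empty(n_draws)
    for i in range(n_draws):
        picks = rng.integers(0, n_records, n_records)
        mean_acov = sample_acovs[picks].mean(axis=0)
        amplitudes[i] = demodulate_amplitude(mean_acov, lags_h)
    return np.percentile(amplitudes, [2.5, 97.5])


# Synthetic test case: 40 noisy sample autocovariances with true amplitude 5.
rng = np.random.default_rng(1)
lags_h = np.arange(24, 73, dtype=float)
truth = 5.0 * np.cos(M2_OMEGA * lags_h)
sample_acovs = truth + rng.normal(0.0, 3.0, size=(40, lags_h.size))

low, high = bootstrap_ci(sample_acovs, lags_h)
print(low, high)
```

Because the interval is read from percentiles of A itself, it is never negative and need not be symmetric, which addresses the objection raised about Figure 2b.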
  
  I also suggest to replace the term "demodulate" by something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
  
  Ans.: From our understanding, the analytic transform would not capture the amplitude at the M2 frequency without band-pass filtering the \eta_1000 time series a priori.
  
  For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:
  - comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?
  - comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?
  - Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?
  
  Ans.: Section 3 in itself is not about validating HYCOM, but about introducing, using a local example, the methods used to further validate HYCOM (section 4). There are other papers describing strict Eulerian point-to-point comparisons between HYCOM and moorings (cf. the cited papers Ansong, 2017 and Luecke, 2020). Rather, the Eulerian component of our analysis is designed to bolster and extend the main Lagrangian component (added clarification at the end of section 1). Therefore, the logical beginning of section 3 is the methodology developed by Geoffroy and Nycander (2022) to estimate the variance of the semidiurnal IT using Lagrangian data. We then developed an Eulerian framework using the HYCOM data, primarily to validate our Lagrangian methodology. Incidentally, it also enables the analysis of the decorrelation of the IT. This mirrors the organization of section 4.
  
  Comments: Abstract:
  line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "auto-correlation" or "auto-covariance" which are established statistical terms. An abstract should be able to stand alone. It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.
  Ans.: Reworked abstract. Suppressed “decorrelate”.
  
  l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?
  Ans.: Yes. Rephrased.
  
  l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?
  Ans.: Corrected
  
  Section 2.3:
  Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?
  Ans.: The reference for HYCOM (Chassignet et al., 2006) was given in the introduction, the first time we used the HYCOM acronym. We do not have any other reference for this particular simulation (apart from `GLBy190.04'). Added information on the Lagrangian simulation.
  
  l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?
  Ans.: Suppressed “mainly”. Added paragraph at the beginning of the section regarding the data we use. The model wasn’t run specifically for this study; we used the data that were available and suitable for our methodology. Hence, there are no real technical constraints to mention.
  
  l101: Why 41644? Does this correspond to a mean geographical density?
  Ans.: Yes, it roughly corresponds to a mean density of 15 particles in our final 200 km radius circular patches. We feel it is unnecessary to add this; moreover, it would not be easy to motivate clearly at this stage of the paper.
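
A rough plausibility check of the quoted density can be sketched as follows. The global ocean area (about 3.61e8 km^2) and the assumption that the particles are spread uniformly over it are introduced here for illustration only; the actual seeding of the Lagrangian simulation is not described in this reply.

```python
import math

N_PARTICLES = 41644          # number of particles quoted in the comment
PATCH_RADIUS_KM = 200.0      # radius of the circular patches quoted in the reply
OCEAN_AREA_KM2 = 3.61e8      # ASSUMPTION: approximate global ocean surface area

# Area of one circular patch, treated as flat (fine at a 200 km scale).
patch_area_km2 = math.pi * PATCH_RADIUS_KM ** 2

# Mean number of particles per patch if the particles were uniformly spread.
mean_per_patch = N_PARTICLES * patch_area_km2 / OCEAN_AREA_KM2

print(f"mean particles per patch: {mean_per_patch:.1f}")
```

Under these assumptions the result is of the same order as the "roughly 15" stated in the reply.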
  
  l106: The effects of the drift? Do you mean potential Lagrangian biases?
  Ans.: As pointed out by referee #1, we prefer to call these effects “Lagrangian decorrelation”.
  
  Section 3.1:
  eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?
  Ans.: Added explanations. We do correct for the float displacement for the Argo data.
  
  l113-115: which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.
  Ans.: Added information. It is the modeled monthly-mean 3D temperature field introduced in section 2.3. For the Argo data, we compute the temperature gradient at 1000 dbar for a given park phase using the temperature profiles recorded by the float immediately before and after that park phase (now made clearer in section 3).
  
  Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?
  Ans.: Figure 1 shows the median position of the Argo segments as dots (now made clear in the text). Added Argo trajectories.
  
  l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?
  Ans.: Rephrased.
  
  l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?
  Ans.: Here R_{argo} is the red curve. We do see it fall below its CI at ~200h.
  
  Eq4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
  Ans.: Correct. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
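
The referee's point can be illustrated with a minimal sketch. It is not the authors' Monte Carlo procedure, which is not detailed in this reply; it only assumes, for illustration, a demodulate made of a real amplitude plus complex Gaussian noise (the names `true_amp` and `noise_std` are hypothetical). A percentile interval of the simulated amplitudes stays non-negative, whereas a symmetric interval of plus or minus two standard errors can extend below zero.

```python
import math
import random

def amplitude_interval(true_amp, noise_std, n_draws=20000, seed=0):
    """95% percentile interval of |A| for A = true_amp + complex Gaussian noise."""
    rng = random.Random(seed)
    amps = sorted(
        math.hypot(true_amp + rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        for _ in range(n_draws)
    )
    lower = amps[int(0.025 * n_draws)]
    upper = amps[int(0.975 * n_draws) - 1]
    return lower, upper

def gaussian_interval(true_amp, noise_std):
    """Symmetric +/- 2 standard-error interval, shown for comparison only."""
    return true_amp - 2.0 * noise_std, true_amp + 2.0 * noise_std

# Hypothetical case where the noise is as large as the signal.
mc_lo, mc_hi = amplitude_interval(true_amp=1.0, noise_std=1.0)
g_lo, g_hi = gaussian_interval(true_amp=1.0, noise_std=1.0)

print(f"Monte Carlo percentile interval: [{mc_lo:.2f}, {mc_hi:.2f}]")
print(f"Gaussian +/- 2 sigma interval:   [{g_lo:.2f}, {g_hi:.2f}]")
```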
  
  Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.
  Ans.: Figure modified.
  
  l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.
  Ans.: Also discussed by referee #1, modified the sentence.
  
  I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance, see Lilly and Gascard (2006) as an example (The analytic signal can be calculated using the Hilbert transform in python with the scipy package or the anatrans.m function of the jLab toolbox for Matlab). One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
  Ans.: Our complex demodulate method specifically selects the amplitude at the semidiurnal frequency. Our understanding is that unless the time series is band-pass filtered prior to computing the autocovariance, the analytic transform cannot be used to isolate the amplitude of the oscillations at the semidiurnal frequency.
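
A minimal sketch of amplitude estimation at a single frequency is given below. It is a generic least-squares fit of a cosine and a sine term over a window of lags, not necessarily the authors' exact implementation; the M2 period of 12.42 hr, the lag window and the function name are assumptions made for illustration.

```python
import math

M2_PERIOD_HR = 12.42  # ASSUMPTION: semidiurnal (M2) period in hours

def demodulate_amplitude(lags_hr, autocov, period_hr=M2_PERIOD_HR):
    """Fit a*cos(w*t) + b*sin(w*t) by least squares and return sqrt(a^2 + b^2)."""
    w = 2.0 * math.pi / period_hr
    scc = ssc = sss = sxc = sxs = 0.0
    for t, r in zip(lags_hr, autocov):
        c, s = math.cos(w * t), math.sin(w * t)
        scc += c * c
        ssc += c * s
        sss += s * s
        sxc += r * c
        sxs += r * s
    # Solve the 2x2 normal equations for the cosine and sine coefficients.
    det = scc * sss - ssc * ssc
    a = (sxc * sss - sxs * ssc) / det
    b = (sxs * scc - sxc * ssc) / det
    return math.hypot(a, b)

# Synthetic autocovariance: 25 m^2 amplitude with an arbitrary phase shift.
w = 2.0 * math.pi / M2_PERIOD_HR
lags = [float(t) for t in range(36, 61)]           # hourly lags around 48 hr
synthetic = [25.0 * math.cos(w * t + 0.3) for t in lags]
amp = demodulate_amplitude(lags, synthetic)
print(f"recovered amplitude: {amp:.3f} m^2")
```

In this sketch the sine coefficient absorbs any phase shift within the window, so the recovered amplitude does not depend on the phase of the oscillation.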
  
  l190: "outliers": please use sentence to explain what you mean.
  Ans.: Also pointed out by referee #1. \eta_1000 values can be unstable when facing small temperature gradients in the denominator of Eq. (1) and (5), leading to unrealistically large variance. For consistency with the Argo results, we use the same quality checks on the variance of \eta_1000 computed using HYCOM data. Added sentence.
  
  l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.
  Ans.: Deleted “scatter”. There was an error in the script calculating R^2. The correct value in log domain is 0.98, in non-log domain it is 0.74. Our R^2 is (Pearson’s R)^2. Changed to r^2 to avoid any confusion with the autocovariance (denoted R).
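
The difference between the two quoted values can be understood with a short sketch of the squared Pearson correlation computed in the linear and in the log10 domain. The sample values below are hypothetical and are not taken from the paper; in the linear domain the statistic is dominated by the largest values, whereas in the log domain every decade carries similar weight.

```python
import math

def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# HYPOTHETICAL variances (m^2) spanning several decades.
lagrangian = [0.1, 1.0, 10.0, 100.0, 1000.0]
eulerian = [0.12, 0.8, 13.0, 90.0, 600.0]

r2_linear = pearson_r2(lagrangian, eulerian)
r2_log = pearson_r2([math.log10(v) for v in lagrangian],
                    [math.log10(v) for v in eulerian])
print(f"r^2 (linear domain): {r2_linear:.3f}")
print(f"r^2 (log10 domain):  {r2_log:.3f}")
```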
  
  l198: "taken as ..." : state this earlier to remind the reader.
  Ans.: Added statement earlier in the text.
  
  l208: a bias which means that HYCOM underestimate Argo, correct?
  Ans.: Correct. Rephrased.
  
  Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
  Ans.: Replaced log10(x/y) by x/(x+y).
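
The behaviour of the two statistics can be compared in a short sketch. Which quantity plays the role of x and which of y in Figure 7b is not stated here, so the arguments below are generic placeholders.

```python
import math

def bounded_ratio(x, y):
    """x / (x + y): always between 0 and 1 for non-negative inputs, 0.5 when x == y."""
    return x / (x + y)

def log_ratio(x, y):
    """log10(x / y): unbounded, and undefined when either input is zero."""
    return math.log10(x / y)

# Equal values map to the midpoint of the bounded scale.
print(bounded_ratio(2.0, 2.0))

# An extreme pair stays within (0, 1), while the log ratio is very large.
print(bounded_ratio(1e6, 1e-6), log_ratio(1e6, 1e-6))

# The bounded ratio remains defined when y is zero; log10(x / y) does not.
print(bounded_ratio(5.0, 0.0))
```

For positive inputs the two are related by x/(x+y) = 1 / (1 + 10^(-log10(x/y))), so the bounded ratio carries the same information without requiring a truncated colour scale.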
  
  l214: The ratio increases approaching the poles? Where is this seen?
  Ans.: The ratio itself is not shown, added comment.
  
  l225: Should you conclude the section with some statement?
  Ans.: Added sentence referring to the discussion on potential sources of biases.
  
  Section 4.2:
  l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?
  Ans.: Also noted by referee #1. Everywhere replaced "intrinsic decorrelation" by "decorrelation of the IT", or "decorrelation", and further "apparent decorrelation" by "Lagrangian decorrelation".
  
  l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).
  Ans.: Changed the clause to “We compute a sample autocovariance in the Eulerian framework for each particle”. This refers to the sample autocovariance computed in our Eulerian framework as explained in section 3.3. The sample autocovariance in the Eulerian framework is computed along the particle’s trajectory using Eulerian data.
  
  l235: yes indeed because the autocovariance and its amplitude are probably not gaussian distributed!
  Ans.: Agreed, by definition our demodulates are Rice distributed. Deleted “(and their demodulates)”. However, we emphasize that the sample mean autocovariance at a given time lag can be considered Gaussian distributed in the following two cases:
  - In a local geographical patch: we assume the particles are randomly sampling a wave field with uniform statistics.
  - When computing the sample mean autocovariance from a very large population of particles (global or regional scales), by virtue of the central limit theorem.
  
  l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.
  Ans.: As noted by referee #1, as a ratio of sample statistics, SCVF_15 is expected to be noisy. We now plot the IT variance instead.
  
  l253: "log domain" : this figure appears to be on a linear scale?
  Ans.: Plot changed.
  
  l270: truely -> truly
  Ans.: Corrected
  
  Figure 11b: a legend for the various fitted curve would be very helpful.
  Ans.: Added legend
  
  l293: Why is it 3 times T_{int}?
  Ans.: Because 95% of the exponential decay is achieved within 3 time constants (exp(-3) ≈ 0.05). Added comment.
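
The arithmetic behind this choice can be checked in a few lines. The function name is a placeholder introduced here, and the time is expressed in units of the time constant (denoted T_{int} in the comment).

```python
import math

def fraction_decayed(n_time_constants):
    """Fraction of an exponential decay exp(-t/T) completed after t = n * T."""
    return 1.0 - math.exp(-n_time_constants)

for n in (1, 2, 3):
    print(f"after {n} time constant(s): {fraction_decayed(n):.3f}")
```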
  
  l306: "Note that ..." : this should be moved earlier just after your eq 6.
  Ans.: This remark explains the results of the fitting. We did not assume this when defining our model.
  
  l315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.
  Ans.: Expanded on the implications for HYCOM.
  
  Section 4.4:
  l323: If your method holds, should you not rather say that the model is biased low?
  Ans.: Since we do not see any reason for the in situ data/processing to be biased high, indeed our conclusion is that HYCOM is biased low.
  
  Data availability: A statement on the HYCOM data availability is missing.
  Ans.: Added statement
  
  Citation: https://doi.org/10.5194/egusphere-2022-1085-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Gaspard Geoffroy on behalf of the Authors (10 Feb 2023) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (10 Feb 2023) by Ilker Fer

RR by Anonymous Referee #1 (25 Feb 2023)

Suggestions for revision or reasons for rejection

egusphere-2022-1085 revision

They responded well to the comments of the reviewers. I appreciate their
efforts to present this material and to clarify my misunderstandings in the
previous review.

I still have a question about whether the non-tidal processes at short
lags, which apparently contain much more variance than the tidal processes
for the Argo and mooring data, can significantly bias the estimated amplitude
of the demodulate at 48-hr. Since the paper deals with amplitude and
amplitude ratios involving the 48-hr demodulate, it would be important
to quantify this effect. (See comments regarding Fig 8e+f, below.)

I would suggest "minor revision", but I trust that the authors and editor
can decide whether their reply to my remaining query requires another
full round of peer review or not.

l52: I'm not sure what "oceanic noise" refers to here.

l88: Please don't conflate approximation with asymptotics ("order of").

Fig 1: Can you please adjust the aspect ratio so that the circle looks
more circular?

l163: I still don't understand why the sine term is used in estimating
the amplitude envelope. The sample autocovariance is an even function,
and so is the true autocovariance. The reply to comments in the discussion
paper online mentions that a short derivation is attached, but I don't
see it.

l175: I don't understand why a "reasonable" estimate of the uncertainty of
the envelope is the estimated uncertainty of the oscillating function
itself. For example, if we had a constant-amplitude sinusoid plus
white noise, the uncertainty of the envelope would be much smaller
than the pointwise scatter of the sinusoidal curve, assuming the
time series length is much longer than the period of oscillation.

(Although -- after finishing the whole manuscript, the error bars don't
seem to play a significant role in any conclusions.)

Fig 2: The caption mentions that the zero-lag Argo variance is 121m^2.
If 25m^2 of this is due to internal tide, it leaves nearly 100m^2 as
"noise" variance -- due to instrument noise or rapidly-decorrelating
oceanographic signals. Perhaps this value is plausible for the broadband
internal wave variance, but I am not an expert. Could you comment on the
origin of the 100m^2 nontidal variance at zero lag? Is it comparable to
the value obtained from the mooring in Fig 4?

l223: "criteria" --> "criterion"

l223-224: I thought the latter criterion was already taken care of
by the low temperature gradient test. I don't understand why these two
tests (low temperature gradient and excessive eta_1000 variance) would
need to be applied.

Fig 8e+f: It is a little suspicious to me that the Argo envelope
for lags from about 50 to 150hr extrapolates to zero lag rather close
to the zero lag value of HYCOM. Is your procedure for estimating the
envelope validated for the case of a large "nugget" at zero lag? I wonder
if there is a bias issue connected with how the huge variance at
zero lag (in the Argo data) could "leak" into the demodulate amplitude
for small lags, somewhat analogous to spectral leakage? It might
be useful to look at these autocovariances in the spectral domain,
instead.

To make this question concrete: Suppose the Argo floats sample an
oceanographic process that decorrelates over 3 hours, but the
variance of this process is about 5 times larger than the variance
of the IT. How much will your estimate of IT variance, based on
the demodulate amplitude at 48hr, be affected by the
rapidly-decorrelating process? What if the rapid process decorrelates
over 12hr, instead?
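
A rough order-of-magnitude sketch of this scenario is given below. It assumes, purely for illustration, that the rapid process has an exponential autocovariance with an e-folding time equal to its decorrelation time, and it only evaluates the raw autocovariance at the 48 hr lag. It does not reproduce the demodulation over a window of lags, so it does not settle the question of leakage into the demodulate.

```python
import math

def residual_fraction(variance_ratio, decorrelation_hr, lag_hr=48.0):
    """Autocovariance of the rapid process at lag_hr, in units of the IT variance.

    ASSUMPTION: exponential autocovariance, variance * exp(-lag / T).
    """
    return variance_ratio * math.exp(-lag_hr / decorrelation_hr)

# Rapid process with 5 times the IT variance, as in the scenario above.
fast = residual_fraction(5.0, 3.0)    # decorrelates over 3 hr
slow = residual_fraction(5.0, 12.0)   # decorrelates over 12 hr

print(f"3 hr process at 48 hr lag:  {fast:.1e} of the IT variance")
print(f"12 hr process at 48 hr lag: {slow:.3f} of the IT variance")
```

Under this assumption the 3 hr process is negligible at 48 hr, whereas the 12 hr process still contributes roughly 9% of the IT variance.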

l300: The spring-neap period is about 15 days, not twice this.
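
The value of about 15 days follows from the beat period of the two main semidiurnal constituents, as sketched below. The M2 and S2 periods used are standard values introduced here and are not quoted in the report.

```python
M2_PERIOD_HR = 12.4206  # principal lunar semidiurnal period (hours)
S2_PERIOD_HR = 12.0     # principal solar semidiurnal period (hours)

def beat_period_days(period1_hr, period2_hr):
    """Beat period, in days, of two oscillations with the given periods in hours."""
    beat_frequency_per_hr = abs(1.0 / period1_hr - 1.0 / period2_hr)
    return 1.0 / beat_frequency_per_hr / 24.0

spring_neap_days = beat_period_days(M2_PERIOD_HR, S2_PERIOD_HR)
print(f"spring-neap period: {spring_neap_days:.2f} days")
```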

l301: "Note that ... process." omit? I don't understand this.
Why "strong/short" and "weak/long"? Isn't this exactly the distinction you
can make, i.e., you can't distinguish strong/long from weak/short?

Table 2 + Fig 10: Since these involve ratios of the IT estimated
at 48hr lag relative to the estimate at 15-d, it would be important to
understand the answer to my above question about how the large-variance
rapidly-decorrelating processes in the data might bias the estimate at
48-hr lag.

l391-421: It seems like this could be considerably shortened, emphasizing
the conclusions of the last paragraph.

l440: add "in the model." ?

l454: "coarser" --> "finer" ? Also -- shouldn't you use the HYCOM bathymetry,
rather than GEBCO bathymetry, to compute HYCOM modes? Although, I'd
be surprised if this made much difference.

Table 4: It might be interesting to compare the decorrelation timescale
T with the estimate in Zaron 2022, "Baroclinic tidal cusps from satellite altimetry." J. Phys. Oceanogr., 52(12):3123--3137, or mention why you don't think the results should be compared.

RR by Shane Elipot (09 Mar 2023)

ED: Publish subject to minor revisions (review by editor) (19 Mar 2023) by Ilker Fer

AR by Gaspard Geoffroy on behalf of the Authors (28 Apr 2023) Author's response Author's tracked changes Manuscript

ED: Publish as is (03 May 2023) by Ilker Fer

AR by Gaspard Geoffroy on behalf of the Authors (04 May 2023)

Journal article(s) based on this preprint

12 Jun 2023

Validating the spatial variability in the semidiurnal internal tide in a realistic global ocean simulation with Argo and mooring data

Gaspard Geoffroy, Jonas Nycander, Maarten C. Buijsman, Jay F. Shriver, and Brian K. Arbic

Ocean Sci., 19, 811–835, https://doi.org/10.5194/os-19-811-2023, 2023

Viewed

Total article views: 430 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
286	129	15	430	10	5

Views and downloads (calculated since 24 Oct 2022)

Month	HTML	PDF	XML	Total
Oct 2022	83	33	4	120
Nov 2022	78	21	4	103
Dec 2022	25	18	0	43
Jan 2023	21	13	4	38
Feb 2023	27	14	0	41
Mar 2023	24	9	0	33
Apr 2023	14	9	3	26
May 2023	10	6	0	16
Jun 2023	4	6	0	10

Viewed (geographical distribution)

Total article views: 437 (including HTML, PDF, and XML) Thereof 437 with geography defined and 0 with unknown origin.

Latest update: 12 Jun 2023

Short summary

The ocean state is sensitive to the mixing originating from internal tides (IT). To date, our knowledge of the magnitude and spatial distribution of this mixing mostly relies on uncertain modeling. Here, we use novel observations from autonomous floats to validate the spatial variability of the semidiurnal IT in a realistic ocean simulation. The numerical simulation is found to correctly reproduce the main spatial patterns of the observed tidal energy, but to be biased low at the global scale.