the Creative Commons Attribution 4.0 License.
Validating the spatial variability of the semidiurnal internal tide in a realistic global ocean simulation with Argo and mooring data
Gaspard Geoffroy
Jonas Nycander
Maarten C. Buijsman
Jay F. Shriver
Brian K. Arbic
Abstract. The total variance and decorrelation of the semidiurnal internal tide (IT) are examined in a 32-day segment of a global run of the HYbrid Coordinate Ocean Model (HYCOM). This numerical simulation, with 41 vertical layers and 1/25 degree horizontal resolution, includes tidal and atmospheric forcing, allowing for the generation and propagation of IT to take place within a realistic eddying general circulation. The HYCOM data are in turn compared with global observations of the IT around 1,000 dbar, from Argo float park phase data and mooring records. HYCOM is found to be globally biased low in terms of total variance and decorrelation of the semidiurnal IT over timescales shorter than 32 days. Except in the Southern Ocean, where limitations in the model cause the discrepancy with in situ measurements to grow poleward, the spatial correlation between the Argo and HYCOM inferred total variance suggests that the generation of low-mode semidiurnal IT is globally well captured by the model.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2022-1085', Anonymous Referee #1, 21 Nov 2022
The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the time-lagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32-day-long HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at near-zero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used Caspar-Cohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.
They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing small-scale mesoscale or submesoscale variability in HYCOM or the deficit of high-frequency wind forcing.
Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.
Comments:
l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "non-total" quantity to understand what distinction is implied by "total".
l11: "beams" -> "waves" or "beams of waves"
l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".
l19: "causes" -> "cause"
l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest rethinking the purpose of this paragraph and rewriting it to more clearly articulate the point you wish to make.
l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.
l52: Finally, "the autocovariance at short time lags" is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and high-frequency ocean variability when looking at the "short time lags"?
l52: "On the other hand" - I am not clear what is the other part of the contrast. Omit this phrase?
l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.
l63: omit "the strength of"
l73: "can vary" by how much?
l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?
l111: Is dT/dP in the numerator the same as dTbar/dP?
l114: "obtained" -> "estimated"
l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?
l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right? Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a non-robust estimator (Var A)?
l157: "not significantly different" - Well, I agree that they do fall within each other's standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?
l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?
Fig 3: This is related to the above question: Why are the Eulerian error bars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?
Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).
l190: "for each particle" -> "for each HYCOM particle"? But if this paragraph applies to HYCOM, how can there be outliers?
l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified.
Paragraph at line 195: This is a very good comparison and a little surprising to me.
Fig 6: Why are the maps drawn so small?
Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.
l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?
l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1 m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?
l227: Once again, "intrinsic" does not seem to be the right word.
l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.
l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.
l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh - I see it in Table 1.
l262: I am not sure what the "discrepancy" refers to.
l265: SCVF^{15} -> SCVF_{15}
l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.
l275: "mecanism" -> "mechanism"
l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.
l306: "sinusoide" -> "sinusoid"
Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.
l330-358: Modes discussion.
l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.
l386-401: Stratification.
None of these discussions really deal with the overall quantitative difference of HYCOM vs obs, which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.
l403: "run with" -> "run of"
l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?
l422: "stationnary" -> "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".
l429: "unaffected" -> "unaffected in the mean"
l339: "supposedly account" -> "supposedly accounts"
l452: LaTeX formatting needs help in the URL.
l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html
References: inconsistent capitalization is used in article titles
l557: "and contributors, T. P."
Citation: https://doi.org/10.5194/egusphere-2022-1085-RC1
AC1: 'Reply on RC1', Gaspard Geoffroy, 24 Jan 2023
We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, helped improve the manuscript.
General comments:
The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the time-lagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32-day-long HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at near-zero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used Caspar-Cohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.
They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing small-scale mesoscale or submesoscale variability in HYCOM or the deficit of high-frequency wind forcing.
Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.
Comments:
l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "non-total" quantity to understand what distinction is implied by "total".
Ans.: Reworked abstract. Suppressed “total”.
l11: "beams" -> "waves" or "beams of waves"
Ans.: Corrected
l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".
Ans.: The sentence is correct: the phase difference accounts for the propagation of the waves from the generation site to any given position, but it is constant in time.
l19: "causes" -> "cause"
Ans.: Corrected
l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest rethinking the purpose of this paragraph and rewriting it to more clearly articulate the point you wish to make.
Ans.: Reworked paragraph, recentering it on the stationary/nonstationary wave field.
l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.
Ans.: “total” was defined earlier in the text (l.25). Added italic font.
l52: Finally, "the autocovariance at short time lags" is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and high-frequency ocean variability when looking at the "short time lags"?
Ans.: “total” was defined earlier in the text (l.25). Provided short explanation on how noise is filtered out.
l52: "On the other hand" - I am not clear what is the other part of the contrast. Omit this phrase?
Ans.: Contrasts the Lagrangian decorrelation downside with the fine time resolution upside.
l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But, the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.
Ans.: Agreed. Replaced "intrinsic decorrelation" by "decorrelation" or "decorrelation of the IT", and further "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
l63: omit "the strength of"
Ans.: Corrected
l73: "can vary" by how much?
Ans.: Replaced by “The sampling period of the park phase can occasionally vary by more than a few seconds.” The vast majority of the park phases we use have a sampling period of 1h. Very rarely do park phases have sampling periods significantly shorter (and even more rarely longer) than one hour.
l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?
Ans.: Added information at the beginning of section 2, and in sections 2 and 3 (isotherm displacement is properly defined in section 3).
l111: Is dT/dP in the numerator the same as dTbar/dP?
Ans.: No: Tbar in the numerator is not a function of z, and they are calculated independently.
l114: "obtained" -> "estimated"
Ans.: Corrected
l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?
Ans.: This estimator is unbiased: for increasingly large N, the estimator converges to the true value.
l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right? [...]
Ans.: Indeed, the autocovariance is an even function. However, when two tidal constituents close in frequency are present, the autocovariance of the resulting beating can be expressed in terms of a cosine and a sine component (short derivation attached).
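The attached derivation is not reproduced on this page. The following is only a sketch of the argument under stated assumptions, not the authors' derivation: two sinusoids with amplitudes a_1, a_2, frequencies \omega_1, \omega_2 and independent, uniformly distributed phases, demodulated at a carrier frequency \omega_0. All of these symbols are introduced here for illustration.

```latex
% Autocovariance of two independent sinusoids (assumed model)
R(\tau) = \frac{a_1^2}{2}\cos(\omega_1\tau) + \frac{a_2^2}{2}\cos(\omega_2\tau)

% Write \omega_i = \omega_0 + \Delta_i and expand each cosine
\cos(\omega_i\tau) = \cos(\omega_0\tau)\cos(\Delta_i\tau) - \sin(\omega_0\tau)\sin(\Delta_i\tau)

% Collect terms on the carrier
R(\tau) = C(\tau)\cos(\omega_0\tau) + S(\tau)\sin(\omega_0\tau),
\qquad
C(\tau) = \sum_{i=1}^{2}\frac{a_i^2}{2}\cos(\Delta_i\tau),
\qquad
S(\tau) = -\sum_{i=1}^{2}\frac{a_i^2}{2}\sin(\Delta_i\tau)
```

Under these assumptions S(\tau) is odd in \tau, so S(\tau)\sin(\omega_0\tau) is even and R(\tau) remains an even function. S(\tau) is generally nonzero whenever the carrier does not sit symmetrically between constituents of equal amplitude, for example when \omega_0 is set to one of the two constituent frequencies.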
[…] Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a non-robust estimator (Var A)?
Ans.: The definition of the total error in equation (4) was not rigorous. Referee #2 also pointed at the wrong assumption of Gaussian statistics when computing the confidence interval of the complex demodulate. For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulate.
l157: "not significantly different" - Well, I agree that they do fall within each other's standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?
Ans.: Modified the sentence: “Apart from the first couple of demodulates, the HYCOM demodulate series appears consistently smaller than the Argo one.” The number of d.o.f. is not straightforward to estimate, mainly because one first has to estimate how many independent values our 767 h records contain. Eq. (24) in Awe’s 1964 paper “Errors in correlation between time series” gives a way to compute such an estimate. However, since we do not know the underlying true autocovariance function, we cannot precisely evaluate this. Using the local mean autocovariance computed from Argo data for the local example shown in Fig. 2, we estimate that values separated by L~12 h may be considered independent of one another. Hence, when computing the mean autocovariance at a given \tau, we have roughly N_p*(N-\tau)/L d.o.f., with the number of 32-day records N_p=8 and the number of values in each record N=767. For small \tau, we have approximately 8*767/12 ~ 500 d.o.f. For the corresponding Lagrangian data from HYCOM we get ~250 d.o.f. (here L~39 h). Thus, both have more than enough d.o.f. to be considered different. The new confidence interval estimates reflect that conclusion.
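The arithmetic behind the Argo figure can be checked with a minimal sketch. It reads the estimate as N_p (N - \tau) / L and assumes hourly sampling, so that lag and decorrelation scale are both counted in samples. The function name `approx_dof` is hypothetical. The HYCOM value of ~250 is not reproduced because the number of HYCOM records is not stated in the reply.

```python
def approx_dof(n_records, n_samples, lag, decorrelation_scale):
    """Rough d.o.f. of a mean autocovariance at one lag: N_p * (N - tau) / L.

    All arguments are in samples (hourly sampling assumed, so 1 sample = 1 h).
    """
    if decorrelation_scale <= 0:
        raise ValueError("decorrelation_scale must be positive")
    if not 0 <= lag <= n_samples:
        raise ValueError("lag must lie between 0 and n_samples")
    return n_records * (n_samples - lag) / decorrelation_scale


# Values quoted in the reply for the Argo example of Fig. 2:
# N_p = 8 records, N = 767 values per record, L ~ 12 h, small lag (tau ~ 0).
argo_dof_small_lag = approx_dof(n_records=8, n_samples=767, lag=0,
                                decorrelation_scale=12)
print(round(argo_dof_small_lag))  # about 511, consistent with the quoted ~500
```

The count shrinks linearly with lag, so estimates at the longest lags rest on far fewer independent values than those near zero lag.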
l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?
Ans.: Precisely. That way we are not introducing any discrepancy due to the (random) Lagrangian spatial sampling. Added explanations in the text.
Fig 3: This is related to the above question: Why are the Eulerian error bars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?
Ans.: The Eulerian mean autocovariance uses many more d.o.f. There are about 60 times more 32-day segments that are used to compute the Eulerian mean autocovariance, resulting in about 4600 d.o.f. (L is also larger, i.e. the data are less independent). Added explanations in the main text. Added in the caption of Fig. 3 that the Lagrangian HYCOM curves in Fig. 2 and 3 are identical.
Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).
Ans.: The color choice we made is to have the HYCOM Lagrangian data plotted in black, and the in situ data in red whenever possible. The red curve in Fig 3b should indeed be the same as the black curve in Fig 4b (Added text in the caption of Fig.4). There was an error in the plotting script.
l190: "for each particle" -> "for each HYCOM particle"? But if this paragraph applies to HYCOM, how can there be outliers?
Ans.: Replaced "for each particle" by "for each HYCOM particle". \eta_1000 values can be unstable when facing small temperature gradients in the denominator of Eq. (1) and (5), leading to unrealistically large variance. For consistency with the Argo results, we use the same quality checks on the variance of \eta_1000 computed using HYCOM data. Added explanations in the text.
l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified. Paragraph at line 195: This is a very good comparison and a little surprising to me.
Ans.: Yes, the HYCOM Eulerian autocovariances are computed all along the trajectory subsampled every 12 h. As written before, the HYCOM Eulerian autocovariance estimates do use many more degrees of freedom. As a result the formal error is much smaller.
Fig 6: Why are the maps drawn so small?
Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.
Ans.: Enlarged Fig. 6, 7, 8, 9, 13, 14.
l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?
Ans.: The ratio itself is not plotted (added as a comment in the main text), however we write its mean value and standard deviation in the text.
l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1 m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?
Ans.: This criterion removes roughly 3000 Argo segments (~15%), but the final maps only get ~1% less coverage. This mostly affects the Argo data density in the regions east of Drake Passage, east of Agulhas, and in the Equatorial Pacific. Attached is a figure similar to Fig. 8 but based on the mean drift speed: the Argo mean autocovariance remains more or less the same while the Lagrangian HYCOM autocovariance is significantly smaller for drift speeds smaller than 0.033 m s^-1. Coincidentally, the HYCOM bins with a mean speed smaller than 0.033 m s^-1 are mostly poleward of 50 deg S.
l227: Once again, "intrinsic" does not seem to be the right word.
Ans.: Deleted "intrinsic".
l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.
Ans.: As pointed out by referee #2, the complex demodulates are not Gaussian distributed. Following our definition of the complex demodulate, if the fitted C and S are Gaussian random variables with the same standard deviation, then A = sqrt(C^2+S^2) follows a Rice distribution. We did not give more thought to the expected distribution of the local mean autocovariance.
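The statement about the Rice distribution can be illustrated with a small simulation. This is a sketch under the assumptions named in the answer (independent Gaussian C and S with a common standard deviation), not the Monte Carlo procedure used in the revised manuscript. The function names and the numerical values of the means and standard deviation are hypothetical.

```python
import math
import random


def simulate_amplitudes(c_mean, s_mean, sigma, n_draws=20000, seed=0):
    """Draw A = sqrt(C^2 + S^2) with C ~ N(c_mean, sigma^2), S ~ N(s_mean, sigma^2).

    Returns the draws sorted in ascending order.
    """
    rng = random.Random(seed)
    return sorted(
        math.hypot(rng.gauss(c_mean, sigma), rng.gauss(s_mean, sigma))
        for _ in range(n_draws)
    )


def percentile_interval(sorted_values, level=0.95):
    """Central percentile interval read directly from sorted draws."""
    n = len(sorted_values)
    lower = sorted_values[int((1.0 - level) / 2.0 * n)]
    upper = sorted_values[min(n - 1, int((1.0 + level) / 2.0 * n))]
    return lower, upper


# Illustrative values only: noise comparable in size to the true amplitude.
amplitudes = simulate_amplitudes(c_mean=1.0, s_mean=0.5, sigma=1.0)
low, high = percentile_interval(amplitudes)
mean_amplitude = sum(amplitudes) / len(amplitudes)
print(low, high, mean_amplitude)
```

Two properties follow from the construction: the interval never extends below zero, and the mean of A exceeds sqrt(c_mean^2 + s_mean^2), so a symmetric "plus or minus two standard errors" band around the fitted amplitude is not appropriate when the noise is not small.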
l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.
Ans.: Indeed the SCVF_15 statistic from local mean autocovariance was very noisy. We now only compute it from larger populations of sample autocovariances. Fig 9 is now showing the variance instead of SCVF_15.
l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh - I see it in Table 1.
Ans.: 
l262: I am not sure what the "discrepancy" refers to.
Ans.: Poor writing, rewrote the sentence.
l265: SCVF^{15} -> SCVF_{15}
Ans.: Corrected
l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.
Ans.: Replaced “no impact” by “no significant impact”. We do not know what the true autocovariance function is or how it behaves close to zero time lag. The first demodulate is a conservative, simple, and robust estimate of the IT variance. It is also consistent within this work.
l275: "mecanism" -> "mechanism"
Ans.: Corrected
l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.
Ans.: Replaced "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
l306: "sinusoide" -> "sinusoid"
Ans.: Corrected
Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.
Ans.: The M2-S2 beating, although probably dominating the semidiurnal amplitude modulation, is definitely not the only contribution to this amplitude modulation (other constituents close to M2 play a role).
l330-358: Modes discussion.
Ans.: 
l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.
Ans.: We do not have quantitative information on this error spectrum.
l386-401: Stratification. None of these discussions really deal with the overall quantitative difference of HYCOM vs obs, which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.
Ans.: Added summary sentence at the end of Sect. 4.
l403: "run with" -> "run of"
Ans.: Corrected
l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?
Ans.: This factor of 1.5 was already mentioned at l.225. Contrary to the 0.74 value at l.216, it does include latitudes north of 50 deg N. Rewrote with factors consistent throughout the section. Added Table 1 to summarize the statistics for the different groups.
l422: "stationnary" -> "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".
Ans.: Corrected. Replaced “the IT becomes stationary” by “the IT autocovariance reaches its stationary limit“.
l429: "unaffected" -> "unaffected in the mean"
Ans.: Corrected
l339: "supposedly account" -> "supposedly accounts"
Ans.: Corrected
l452: LaTeX formatting needs help in the URL.
Ans.: Corrected
l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html
References: inconsistent capitalization is used in article titles
Ans.: Corrected
l557: "and contributors, T. P."
Ans.: Corrected


RC2: 'Comment on egusphere-2022-1085', Anonymous Referee #2, 07 Dec 2022
General comments:
This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.
I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest that the authors properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not take that approach from the beginning? I also suggest replacing the term "demodulate" by something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:
- comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?
- comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?
- comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?
Specific comments and technical corrections:
Abstract: line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "autocorrelation" or "autocovariance" which are established statistical terms. An abstract should be able to stand alone.
It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.
l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?
l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?
Section 2.3: Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?
l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?
l101: Why 41644? Does this correspond to a mean geographical density?
l106: The effects of the drift? Do you mean potential Lagrangian biases?
Section 3.1; eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?
l113-115: Which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.
Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?
l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?
l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?
Eq. 4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your Figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
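To make the point in this comment concrete, here is a minimal Python sketch of an equal-tailed confidence interval for an amplitude under the assumption stated above, namely that the estimate of A^2 scales like a chi-square variable with 2 degrees of freedom. The function names are illustrative and do not come from the manuscript; whether this distributional assumption holds for the authors' demodulates is not established here.

```python
import math

def chi2_2dof_quantile(p):
    """Quantile of a chi-square variable with 2 degrees of freedom.

    For 2 d.o.f. the CDF is 1 - exp(-x / 2), so it inverts in closed form.
    """
    return -2.0 * math.log(1.0 - p)

def amplitude_ci(a_hat, level=0.95):
    """Equal-tailed CI for the true amplitude A given an estimate a_hat.

    Assumption (from the comment above, not from the manuscript):
    2 * a_hat**2 / A**2 follows a chi-square distribution with 2 d.o.f.
    """
    alpha = 1.0 - level
    q_lo = chi2_2dof_quantile(alpha / 2.0)
    q_hi = chi2_2dof_quantile(1.0 - alpha / 2.0)
    lower = a_hat * math.sqrt(2.0 / q_hi)
    upper = a_hat * math.sqrt(2.0 / q_lo)
    return lower, upper

lower, upper = amplitude_ci(10.0)
print(f"95% CI for A: [{lower:.2f}, {upper:.2f}]")
```

Under this assumption the interval is asymmetric and strictly positive, in contrast to an interval built as plus or minus two standard errors.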
Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.
l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.
I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance; see Lilly and Gascard (2006) for an example. (The analytic signal can be calculated using the Hilbert transform in Python with the SciPy package or the anatrans.m function of the jLab toolbox for Matlab.) One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
l190: "outliers": please use a sentence to explain what you mean.
l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.
l198: "taken as ..." : state this earlier to remind the reader.
l208: a bias which means that HYCOM underestimates Argo, correct?
Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
l214: The ratio increases approaching the poles? Where is this seen?
l225: Should you conclude the section with some statement?
Section 4.2:
l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?
l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).
l235: yes indeed because the autocovariance and its amplitude are probably not Gaussian distributed!
l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.
l253: "log domain" : this figure appears to be on a linear scale?
l270: truely -> truly
Figure 11b: a legend for the various fitted curves would be very helpful.
l293: Why is it 3 times T_{int}?
l306: "Note that ..." : this should be moved earlier just after your eq 6.
l315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.
Section 4.4:
l323: If your method holds, should you not rather say that the model is biased low?
Data availability: A statement on the HYCOM data availability is missing.
Citation: https://doi.org/10.5194/egusphere20221085RC2 
AC2: 'Reply on RC2', Gaspard Geoffroy, 24 Jan 2023
We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, helped improve the manuscript.
General comments:
This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.
I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated in Figure 2b, which shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest that the authors properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not take that approach from the beginning?
Ans.: We wrongly assumed Gaussian distributed complex demodulates, hence our confidence intervals were incorrect. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
I also suggest replacing the term "demodulate" with something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
Ans.: From our understanding, the analytic transform would not capture the amplitude at the M2 frequency without bandpass filtering the \eta_1000 time series a priori.
For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:
 Comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?
 Comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?
 Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?
Ans.: Section 3 in itself is not about validating HYCOM, but about introducing, with a local example, the methods used to validate HYCOM in section 4. Other papers describe strict Eulerian point-to-point comparisons between HYCOM and moorings (cf. the cited papers Ansong, 2017 and Luecke, 2020). Rather, the Eulerian component of our analysis is designed to bolster and extend the main Lagrangian component (added clarification at the end of section 1). Therefore, the logical beginning of section 3 is the methodology developed by Geoffroy and Nycander (2022) to estimate the variance of the semidiurnal IT using Lagrangian data. We then developed an Eulerian framework using the HYCOM data, primarily to validate our Lagrangian methodology. Incidentally, it also enables the analysis of the decorrelation of the IT. This mirrors the organization of section 4.
Comments: Abstract:
line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "autocorrelation" or "autocovariance" which are established statistical terms. An abstract should be able to stand alone. It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.
Ans.: Reworked abstract. Suppressed “decorrelate”.
l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?
Ans.: Yes. Rephrased.
l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?
Ans.: Corrected
Section 2.3:
Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?
Ans.: The reference for HYCOM (Chassignet et al., 2006) was given in the introduction, the first time we used the HYCOM acronym. We do not have any other reference for this particular simulation (apart from `GLBy190.04'). Added information on the Lagrangian simulation.
l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?
Ans.: Suppressed “mainly”. Added paragraph at the beginning of the section regarding the data we use. The model wasn’t run specifically for this study; we used the data that were available and suitable for our methodology. Hence, there are no real technical constraints to mention.
l101: Why 41644? Does this correspond to a mean geographical density?
Ans.: Yes, it roughly corresponds to a mean density of 15 particles in our final 200 km radius circular patches. We feel it is unnecessary to add this; moreover, it would not be easy to motivate clearly at this stage of the paper.
l106: The effects of the drift? Do you mean potential Lagrangian biases?
Ans.: As pointed out by referee #1, we prefer to call these effects “Lagrangian decorrelation”.
Section 3.1:
eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?
Ans.: Added explanations. We do correct for the float displacement for the Argo data.
l113-115: Which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.
Ans.: Added information. It is the modeled monthly-mean 3D temperature field introduced in section 2.3. For the Argo data, we compute the temperature gradient at 1000 dbar for a given park phase using the temperature profiles recorded by the float immediately before and after that park phase (now made clearer in section 3).
Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?
Ans.: Figure 1 shows the median position of the Argo segments as dots (now made clear in the text). Added Argo trajectories.
l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?
Ans.: Rephrased.
l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?
Ans.: Here R_{argo} is the red curve. We do see it fall below its CI at ~200h.
Eq. 4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your Figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
Ans.: Correct. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
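The reply does not describe the Monte Carlo procedure itself. As one possible reading, the sketch below uses a bootstrap over particles to obtain a percentile confidence interval for the amplitude of the mean complex demodulate at a single lag. The resampling scheme, the function name and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def bootstrap_amplitude_ci(demodulates, n_boot=2000, level=0.95, seed=0):
    """Percentile CI for |mean complex demodulate| by resampling particles.

    `demodulates` holds one complex value per particle at a fixed time lag
    (hypothetical input layout, chosen for this sketch).
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(demodulates, dtype=complex)
    n = z.size
    # Draw n_boot resamples of the particles, with replacement
    idx = rng.integers(0, n, size=(n_boot, n))
    amps = np.abs(z[idx].mean(axis=1))
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(amps, [alpha, 1.0 - alpha])
    return float(np.abs(z.mean())), float(lo), float(hi)

rng = np.random.default_rng(1)
# Synthetic weak signal: amplitude 0.2 buried in unit-variance complex noise
z = 0.2 + (rng.normal(size=50) + 1j * rng.normal(size=50)) / np.sqrt(2)
amp, lo, hi = bootstrap_amplitude_ci(z)
print(f"amplitude = {amp:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Because the interval is built from resampled amplitudes, its lower bound cannot fall below zero, which addresses the issue raised about Figure 2b.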
Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.
Ans.: Figure modified.
l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.
Ans.: Also discussed by referee #1, modified the sentence.
I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance; see Lilly and Gascard (2006) for an example. (The analytic signal can be calculated using the Hilbert transform in Python with the SciPy package or the anatrans.m function of the jLab toolbox for Matlab.) One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
Ans.: Our complex demodulate method specifically selects the amplitude at the semidiurnal frequency. Our understanding is that, unless the time series is bandpass filtered prior to computing the autocovariance, the analytic transform cannot be used to isolate the amplitude of the oscillations at the semidiurnal frequency.
l190: "outliers": please use a sentence to explain what you mean.
Ans.: Also pointed out by referee #1. \eta_1000 values can be unstable when the temperature gradient in the denominator of Eqs. (1) and (5) is small, leading to unrealistically large variance. For consistency with the Argo results, we apply the same quality checks to the variance of \eta_1000 computed using HYCOM data. Added sentence.
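The distinction drawn in this exchange can be illustrated on synthetic data. The sketch below builds a toy autocovariance made of a semidiurnal (M2) cosine plus a weaker diurnal (K1) cosine, then compares a least-squares fit of cosine and sine at the M2 frequency with the envelope of the analytic signal. The analytic signal is computed here with a plain FFT, which is the same construction that scipy.signal.hilbert uses. The single fitting window, the K1 contaminant and all amplitudes are assumptions for illustration and do not reproduce the authors' demodulation.

```python
import numpy as np

def analytic_envelope(x):
    """Amplitude of the analytic signal, built by zeroing negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * h))

def demodulate_amplitude(x, lags, omega):
    """Amplitude at frequency omega from a least-squares cos/sin fit."""
    design = np.column_stack([np.cos(omega * lags), np.sin(omega * lags)])
    coeffs, *_ = np.linalg.lstsq(design, x, rcond=None)
    return float(np.hypot(coeffs[0], coeffs[1]))

lags = np.arange(0.0, 720.0, 1.0)      # time lags in hours
omega_m2 = 2 * np.pi / 12.42           # semidiurnal M2, rad per hour
omega_k1 = 2 * np.pi / 23.93           # diurnal K1, rad per hour
autocov = 1.0 * np.cos(omega_m2 * lags) + 0.5 * np.cos(omega_k1 * lags)

amp_m2 = demodulate_amplitude(autocov, lags, omega_m2)
envelope = analytic_envelope(autocov)
print(f"M2 amplitude from fit: {amp_m2:.2f}")
print(f"analytic envelope range: {envelope.min():.2f} to {envelope.max():.2f}")
```

In this toy case the fit recovers an amplitude close to the M2 value of 1, while the analytic envelope swings with the beating between the two components, which is consistent with the authors' remark that a prior bandpass filter would be needed.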
l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.
Ans.: Deleted “scatter”. There was an error in the script calculating R^2. The correct value in the log domain is 0.98; in the non-log domain it is 0.74. Our R^2 is (Pearson’s R)^2. Changed to r^2 to avoid any confusion with the autocovariance (denoted R).
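The gap between the two values is typical for positive quantities that span several decades. The following sketch computes the squared Pearson correlation in both domains on synthetic data with multiplicative scatter. The data, the spread parameters and the function name are assumptions for illustration only and are unrelated to the HYCOM or Argo values.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation coefficient between x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)

rng = np.random.default_rng(0)
# Synthetic variances spanning several decades, with multiplicative scatter
x = np.exp(rng.normal(0.0, 2.0, size=5000))
y = x * np.exp(rng.normal(0.0, 0.3, size=5000))

r2_linear = r_squared(x, y)
r2_log = r_squared(np.log10(x), np.log10(y))
print(f"r^2 (linear) = {r2_linear:.2f}, r^2 (log10) = {r2_log:.2f}")
```

In the linear domain the few largest values dominate the statistic, whereas the log transform weights every decade equally.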
l198: "taken as ..." : state this earlier to remind the reader.
Ans.: Added statement earlier in the text.
l208: a bias which means that HYCOM underestimates Argo, correct?
Ans.: Correct. Rephrased.
Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
Ans.: Replaced log10(x/y) by x/(x+y).
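For reference, the two statistics are monotonic transforms of each other, which a short sketch can verify. Here x and y stand for any two positive variance estimates; which dataset plays which role is not specified in this exchange, so the values below are purely illustrative.

```python
import numpy as np

def bounded_ratio(x, y):
    """Return x / (x + y) for positive inputs; 0.5 means x equals y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return x / (x + y)

# Illustrative values, including two extreme pairs
x = np.array([1.0, 1.0, 1.0e6, 1.0e-6])
y = np.array([1.0, 2.0, 1.0, 1.0])

ratio = bounded_ratio(x, y)
log_ratio = np.log10(x / y)

# x / (x + y) equals 1 / (1 + 10**(-log10(x / y)))
same = np.allclose(ratio, 1.0 / (1.0 + 10.0 ** (-log_ratio)))
print(ratio, log_ratio, same)
```

The bounded form stays between 0 and 1 for the extreme pairs, while log10(x/y) grows without limit, so no truncation of the colour scale is needed.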
l214: The ratio increases approaching the poles? Where is this seen?
Ans.: The ratio itself is not shown, added comment.
l225: Should you conclude the section with some statement?
Ans.: Added sentence referring to the discussion on potential sources of biases.
Section 4.2:
l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?
Ans.: Also noted by referee #1. Everywhere replaced "intrinsic decorrelation" by "decorrelation of the IT", or "decorrelation", and further "apparent decorrelation" by "Lagrangian decorrelation".
l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).
Ans.: Changed the clause to “We compute a sample autocovariance in the Eulerian framework for each particle”. This refers to the sample autocovariance computed in our Eulerian framework as explained in section 3.3. The sample autocovariance in the Eulerian framework is computed along the particle’s trajectory using Eulerian data.
l235: yes indeed because the autocovariance and its amplitude are probably not Gaussian distributed!
Ans.: Agreed, by definition our demodulates are Rice distributed. Deleted “(and their demodulates)“. However, we emphasize that the sample mean autocovariance at a given time lag can be considered Gaussian distributed in the two following cases:
 In a local geographical patch: we assume the particles are randomly sampling a wave field with uniform statistics.
 When computing the sample mean autocovariance from a very large population of particles (global or regional scales), by virtue of the central limit theorem.
l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.
Ans.: As noted by referee #1, as a ratio of sample statistics, SCVF_15 is expected to be noisy. We now plot the IT variance instead.
l253: "log domain" : this figure appears to be on a linear scale?
Ans.: Plot changed.
l270: truely -> truly
Ans.: Corrected
Figure 11b: a legend for the various fitted curves would be very helpful.
Ans.: Added legend
l293: Why is it 3 times T_{int}?
Ans.: Because 95% of the exponential decay is achieved within 3 time constants (exp(-3) ≈ 0.05). Added comment.
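The arithmetic behind this answer can be written out for a generic exponential decay with time constant T. The symbol T is used here only as a placeholder for whichever decay time scale the manuscript's fitted model defines.

```latex
R(\tau) \propto e^{-\tau/T}, \qquad
\frac{R(3T)}{R(0)} = e^{-3} \approx 0.0498, \qquad
1 - e^{-3} \approx 0.95 .
```

After three time constants, about 5% of the initial value remains, so about 95% of the decay has taken place.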
l306: "Note that ..." : this should be moved earlier just after your eq 6.
Ans.: This remark explains the results of the fitting. We did not assume this when defining our model.
l315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.
Ans.: Expanded on the implications for HYCOM.
Section 4.4:
l323: If your method holds, should you not rather say that the model is biased low?
Ans.: Since we do not see any reasons for the in situ data/processing to be biased high, indeed our conclusion is that HYCOM is biased low.
Data availability: A statement on the HYCOM data availability is missing.
Ans.: Added statement

AC1: 'Reply on RC1', Gaspard Geoffroy, 24 Jan 2023
We would like first to thank both referees for their valuable inputs. Many of their remarks proved pertinent, and overall contributed to make the manuscript better.
General comments:
The authors have made an interesting analysis of semidiurnal tides in the HYCOM model using the timelagged Eulerian and Lagrangian autocovariances of vertical isotherm displacement. They compared output from a 32daylong HYCOM simulation with Argo park mode data and moored thermistor data, and found that the "total internal tide" variance in HYCOM is too small, especially in the far Southern Ocean. They also found that the Eulerian and Lagrangian estimates of the tidal variance at nearzero lag agree very well using HYCOM data, which bolsters this analysis and their previous analysis of Argo data. Finally, they used CasparCohen's technique to estimate the "intrinsic" and "apparent" decorrelation times of the internal tide, and found that the observed tides decorrelate faster than the HYCOM tides. The authors make the interesting observation that the mean (stationary) IT in HYCOM is too large compared to altimetry, but the total IT (stationary + nonstationary) is too small compared to Argo and moorings.
They discussed some reasons for the discrepancies between HYCOM and the observations, but it was unclear if any of their suggestions could explain the quantitative differences. With regard to the too stationary tides in HYCOM, they did not mention the possible roles of missing smallscale mesoscale or submesoscale variability in HYCOM or the deficit of highfrequency wind forcing.
Overall, this is a nice piece of work which I think will be of interest to many readers of Ocean Science. I have many small comments, listed below. While I would say I have no major concerns, my comments could justify some new analyses or revisions of results presented, so I recommend Major Revision.
Comments:
l1: Is "total" needed? Why not omit or say "tidal"? Throughout the abstract, "total" is used, but it is not contrasted with "partial" or another "nontotal" quantity to understand what distinction is implied by "total".
Ans.: Reworked abstract. Suppressed “total”.
l11: "beams" > "waves" or "beams of waves"
Ans.: Corrected
l17: Omit "at any given position", since later in the sentence you state that you are referring specifically to "their generation site".
Ans.: The sentence is correct: the phase difference accounts for the propagation of the waves from the generation site to any given position, but it is constant in time.
l19: "causes" > "cause"
Ans.: Corrected
l23-l27: I think I understand what the authors are getting at, but I found the first three sentences confusing. When we look at the plot of an autocovariance, such as is suggested by the first phrase, we would see the envelope of autocovariance decay, and the coherent fraction of the signal will dominate the autocovariance at long lag. It seems like this paragraph may be muddling the ideas of what happens to the autocovariance as a function of increasing lag, versus what happens to the wave energy as a function of increasing propagation distance. I would suggest rethinking the purpose of this paragraph and rewriting it to more clearly articulate the point you wish to make.
Ans.: Reworked paragraph, recentering it on the stationary/nonstationary wave field.
l42: Once again, "total" is used without distinguishing it properly. It seems like it should be clearly defined above, when the ideas of the coherent and incoherent signals are defined.
Ans.: “total” was defined earlier in the text (l.25). Added italic font.
l52: Finally, "the autocovariance at short time lags", is identified with the "total variance". Some sort of explanation needs to be provided earlier. But how is tidal variability distinguished from noise and highfrequency ocean variability when looking at the "short time lags"?
Ans.: “total” was defined earlier in the text (l.25). Provided short explanation on how noise is filtered out.
l52: "On the other hand"  I am not clear what is the other part of the contrast. Omit this phrase?
Ans.: Contrasts the Lagrangian decorrelation downside with the fine time resolution upside.
l55: I am not sure "intrinsic" is the right word. An intrinsic quality ought to be one which is unaffected by extrinsic factors. But, the decorrelation is entirely caused by interactions with the propagation medium. Perhaps it is best to stick with the Eulerian vs Lagrangian distinction, and when the autocovariance is discussed, it seems like you need to be clear whether you are discussing an Eulerian or Lagrangian autocorrelation.
Ans.: Agreed. Replaced "intrinsic decorrelation" by "decorrelation" or "decorrelation of the IT", and further "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
l63: omit "the strength of"
Ans.: Corrected
l73: "can vary" by how much?
Ans.: Replaced by “The sampling period of the park phase can occasionally vary by more than a few seconds.” The vast majority of the park phases we use have a sampling period of 1h. Very rarely do park phases have sampling periods significantly shorter (and even more rarely longer) than one hour.
l81: Did you use exactly the same dataset as in Geoffroy and Nycander (2022)? I would be interested to know how many 32-day records there are, from how many individual drifters. Also, can you remind us exactly what the "data" consist of? Is it time series of isopycnal displacement, inferred from temperature measurements during the park phase, using temperature profiles from the start and end?
Ans.: Added information at the beginning of section 2, and in sections 2 and 3 (isotherm displacement is properly defined in section 3).
l111: Is dT/dP in the numerator the same as dTbar/dP?
Ans.: No: Tbar in the numerator is not a function of z, and they are calculated independently.
l114: "obtained" > "estimated"
Ans.: Corrected
l123: Sorry if I misunderstand what is meant by "unbiased" here, but isn't this a biased estimator when the expected value is taken for fixed N?
Ans.: This estimator is unbiased: for increasingly large N, the estimator converges to the true value.
l140: I do not understand why the sine component is included. The autocovariance is an even function, so any projection onto the sine must be noise, right ? [...]
Ans.: Indeed, the acov is an even function. However, when two tidal constituents close in frequency are present, the autocovariance of the resulting beating can be expressed in terms of a cosine and a sine component (short derivation attached)
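The attached derivation is not reproduced on this page. As a minimal sketch of the argument (an illustrative assumption, not necessarily the authors' derivation), consider two sinusoids with amplitudes $a_1$ and $a_2$, frequencies $\omega_1$ and $\omega_2 = \omega_1 + \Delta\omega$, and independent, uniformly distributed phases. Their autocovariance is a sum of two cosines, which can be rewritten on the carrier frequency $\omega_1$ as:

```latex
\begin{align}
R(\tau) &= \frac{a_1^2}{2}\cos(\omega_1\tau)
         + \frac{a_2^2}{2}\cos\big((\omega_1+\Delta\omega)\tau\big) \\
        &= \underbrace{\left[\frac{a_1^2}{2}
         + \frac{a_2^2}{2}\cos(\Delta\omega\,\tau)\right]}_{C(\tau)}\cos(\omega_1\tau)
         + \underbrace{\left[-\frac{a_2^2}{2}\sin(\Delta\omega\,\tau)\right]}_{S(\tau)}\sin(\omega_1\tau), \\
A(\tau) &= \sqrt{C^2+S^2}
         = \sqrt{\frac{a_1^4}{4}+\frac{a_2^4}{4}
         + \frac{a_1^2 a_2^2}{2}\cos(\Delta\omega\,\tau)} .
\end{align}
```

Here $C(\tau)$ is even and $S(\tau)$ is odd, so $R(\tau)$ remains an even function even though its projection onto $\sin(\omega_1\tau)$ is non-zero. The symbols $a_1$, $a_2$ and $\Delta\omega$ are introduced for this sketch only.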
[…] Likewise, I don't understand the total error defined in equation (4). And why would a robust estimator (median for \tilde{SEM}) be combined with a nonrobust estimator (Var A)?
Ans.: The definition of the total error in equation (4) was not rigorous. Referee #2 also pointed out the wrong assumption of Gaussian statistics when computing the confidence interval of the complex demodulate. For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulate.
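The reply does not describe the Monte Carlo procedure. One possible approach, shown here purely as an illustration, is a bootstrap: resample the individual sample autocovariances with replacement, average them, fit the cosine and sine components at the M2 frequency, and take percentiles of the resulting amplitudes. The function names, the 24 h fitting window, the noise level and the synthetic data below are all assumptions made for this sketch.

```python
import math
import random

M2_PERIOD_H = 12.42                  # M2 period in hours
OMEGA = 2 * math.pi / M2_PERIOD_H    # M2 angular frequency (rad/h)

def fit_amplitude(lags, acov):
    """Least-squares fit acov ~ C*cos(w*tau) + S*sin(w*tau); return sqrt(C^2 + S^2)."""
    cc = sum(math.cos(OMEGA * t) ** 2 for t in lags)
    ss = sum(math.sin(OMEGA * t) ** 2 for t in lags)
    cs = sum(math.cos(OMEGA * t) * math.sin(OMEGA * t) for t in lags)
    yc = sum(r * math.cos(OMEGA * t) for t, r in zip(lags, acov))
    ys = sum(r * math.sin(OMEGA * t) for t, r in zip(lags, acov))
    det = cc * ss - cs * cs          # determinant of the 2x2 normal equations
    c = (yc * ss - ys * cs) / det
    s = (ys * cc - yc * cs) / det
    return math.hypot(c, s)

def bootstrap_ci(lags, sample_acovs, n_draws=1000, level=0.95, seed=0):
    """Percentile interval for the amplitude of the mean autocovariance."""
    rng = random.Random(seed)
    n = len(sample_acovs)
    amps = []
    for _ in range(n_draws):
        # Resample whole records with replacement, then average lag by lag.
        draw = [sample_acovs[rng.randrange(n)] for _ in range(n)]
        mean_acov = [sum(col) / n for col in zip(*draw)]
        amps.append(fit_amplitude(lags, mean_acov))
    amps.sort()
    i_lo = round((1 - level) / 2 * n_draws)
    i_hi = min(n_draws - 1, round((1 + level) / 2 * n_draws) - 1)
    return amps[i_lo], amps[i_hi]

# Synthetic example: 8 records, true amplitude 10 m^2, additive Gaussian noise.
rng = random.Random(1)
lags = list(range(25))               # hourly lags, roughly two M2 periods
records = [[10.0 * math.cos(OMEGA * t) + rng.gauss(0.0, 2.0) for t in lags]
           for _ in range(8)]
lo, hi = bootstrap_ci(lags, records)
print(f"95% interval for A: [{lo:.2f}, {hi:.2f}] m^2")
```

Because the interval is built from percentiles of amplitudes, which are non-negative by construction, it cannot extend below zero, unlike an interval of plus or minus two standard errors.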
l157: "not significantly different"  Well, I agree that they do fall within each others' standard errors, but they look significantly different to me. What is the probability of the offset over so many different lags; how many d.o.f. do you think are in these estimates?
Ans.: Modified the sentence: “Apart from the first couple of demodulates, the HYCOM demodulate series appears consistently smaller than the Argo one.” The number of d.o.f. is not straightforward to estimate, mainly because one first has to estimate how many independent values our 767-h records contain. Eq. (24) in Awe’s 1964 paper “Errors in correlation between time series” gives a way to compute such an estimate. However, since we do not know the underlying true autocovariance function, we cannot precisely evaluate this. Using the local mean autocovariance computed from Argo data for the local example shown in Fig. 2, we estimate that values separated by L~12 h may be considered independent of one another. Hence, when computing the mean autocovariance at a given \tau, we have roughly N_p*(N-\tau)/L d.o.f., with the number of 32-day records N_p=8 and the number of values in each record N=767. For small \tau, we have approximately 8*767/12 ~ 500 d.o.f. For the corresponding Lagrangian data from HYCOM we get ~250 d.o.f. (here L~39 h). Thus, they have more than enough d.o.f. to be considered different. The new confidence interval estimates reflect that conclusion.
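The arithmetic in this reply can be checked with a few lines. The sketch below reads the estimate as N_p (N - \tau) / L, which is consistent with the quoted value of about 500 at small \tau, and assumes hourly sampling so that lags in hours and in samples coincide. The function name is hypothetical.

```python
def effective_dof(n_records, n_values, lag_h, decorrelation_h):
    """Rough d.o.f. of a mean autocovariance at a given lag: N_p * (N - tau) / L."""
    return n_records * (n_values - lag_h) / decorrelation_h

# Argo example from the reply: N_p = 8 records, N = 767 hourly values, L ~ 12 h.
argo_dof_small_lag = effective_dof(n_records=8, n_values=767, lag_h=0, decorrelation_h=12)
argo_dof_long_lag = effective_dof(n_records=8, n_values=767, lag_h=740, decorrelation_h=12)

print(round(argo_dof_small_lag))  # 511, i.e. roughly 500
print(argo_dof_long_lag)          # 18.0
```

The second value shows how quickly the d.o.f. shrink towards the longest lags of a 32-day record, where only a few pairs of values per record contribute.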
l170: Can you explain why you estimated the Eulerian autocovariance along the Lagrangian trajectory? Are you trying to account for the geographic variability of the Eulerian autocovariance?
Ans.: Precisely. That way we are not introducing any discrepancy due to the (random) Lagrangian spatial sampling. Added explanations in the text.
Fig 3: This is related to the above question: Why are the Eulerian errorbars so small compared to the Lagrangian? Maybe you could spend a little more time explaining how this plot relates to Fig 2. Are the Lagrangian HYCOM curves in Fig 2 identical to those in Fig 3?
Ans.: The Eulerian mean autocovariance uses many more d.o.f. There are about 60 times more 32-day segments that are used to compute the Eulerian mean autocovariance, resulting in about 4600 d.o.f. (L is also larger, i.e. the data are less independent). Added explanations in the main text. Added in the caption of Fig. 3 that the Lagrangian HYCOM curves in Fig. 2 and 3 are identical.
Fig 4: Maybe use the same color for the HYCOM curves in each plot? Is the red curve in Fig 3b the same as the black curve in Fig 4b? They seem to both be labelled as demodulates of the HYCOM Eulerian autocovariance, but they seem to have different numeric values (R(740hr) < 10m^2 in Fig 3b, but R(740hr) > 10m^2 in Fig 4b).
Ans.: The color choice we made is to have the HYCOM Lagrangian data plotted in black, and the in situ data in red whenever possible. The red curve in Fig 3b should indeed be the same as the black curve in Fig 4b (Added text in the caption of Fig.4). There was an error in the plotting script.
l190: "for each particle" > "for each HYCOM particle" ? But if this paragraph applies to HYCOM, how can there be outliers?
Ans.: Replaced "for each particle" by "for each HYCOM particle". \eta_1000 values can be unstable when facing small temperature gradients in the denominator of Eq. (1) and (5), leading to unrealistically large variance. For consistency with the Argo results, we use the same quality checks on the variance of \eta_1000 computed using HYCOM data. Added explanations in the text.
l193: Are the HYCOM Eulerian autocovariances computed all along the trajectory? This would seem to use so many more degrees of freedom compared to the Lagrangian estimates. I am not sure why this is done or why it would be justified. Paragraph at line 195: This is a very good comparison and a little surprising, to me.
Ans.: Yes, the HYCOM Eulerian autocovariances are computed all along the trajectory subsampled every 12 h. As written before, the HYCOM Eulerian autocovariance estimates do use many more degrees of freedom. As a result the formal error is much smaller.
Fig 6: Why are the maps drawn so small?
Fig 7 + 9 + 13 + 14: Please enlarge the maps and panels.
Ans.: Enlarged Fig. 6, 7, 8, 9, 13, 14.
l215: The "fairly constant" ratio is not apparent to me in Fig 7c. Should it be?
Ans.: The ratio itself is not plotted (added as a comment in the main text), however we write its mean value and standard deviation in the text.
l225: Would it be fair to guess that there are also many Argo trajectories that were excluded in the Southern Ocean due to the 0.1m/s drift speed cutoff criterion? I wonder if you would see the difference in HYCOM vs Argo if you made a Fig 8 based on the drift speed?
Ans.: This criterion removes roughly 3000 Argo segments (~15%), but the final maps only get ~1% less coverage. This mostly affects the Argo data density in the regions east of Drake Passage, east of Agulhas, and in the Equatorial Pacific. Attached is a figure similar to Fig. 8 but based on the mean drift speed: the Argo mean autocovariance remains more or less the same while the Lagrangian HYCOM autocovariance is significantly smaller for drift speeds smaller than 0.033 m s^-1. Coincidentally, the HYCOM bins with a mean speed smaller than 0.033 m s^-1 are mostly poleward of 50 deg S.
l227: Once again, "intrinsic" does not seem to be the right word.
Ans.: Deleted "intrinsic".
l235: Is this an expected property of the Rayleigh distribution for the pdf of the modulated wave amplitude? You might want to look into this in the acoustics or optics literature. I don't believe this has been observed previously for narrowband ocean internal waves.
Ans.: As pointed out by referee #2, the complex demodulates are not Gaussian distributed. Following our definition of the complex demodulate, if the fitted C and S are Gaussian random variables with the same standard deviation, then A = sqrt(C^2+S^2) follows a Rice distribution. We did not give further thought to the expected distribution of the local mean autocovariance.
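The Rice-distribution statement can be illustrated numerically. The sketch below draws independent Gaussian C and S with a common standard deviation (independence and the numerical values are assumptions of this example), forms A = sqrt(C^2+S^2), and compares the sample mean of A^2 with the known Rice second moment, nu^2 + 2 sigma^2, where nu is the amplitude of the mean vector.

```python
import math
import random

def simulate_amplitudes(c_mean, s_mean, sigma, n=100_000, seed=0):
    """Draw n amplitudes A = sqrt(C^2 + S^2) with C, S independent Gaussians."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(c_mean, sigma), rng.gauss(s_mean, sigma))
            for _ in range(n)]

c_mean, s_mean, sigma = 3.0, 4.0, 2.0     # illustrative values only
amps = simulate_amplitudes(c_mean, s_mean, sigma)

nu = math.hypot(c_mean, s_mean)           # noncentrality: |(c_mean, s_mean)| = 5
expected_mean_sq = nu**2 + 2 * sigma**2   # Rice second moment: 25 + 8 = 33
sample_mean_sq = sum(a * a for a in amps) / len(amps)

print(f"min(A) = {min(amps):.3f} (never negative)")
print(f"mean(A^2): sample {sample_mean_sq:.2f}, Rice theory {expected_mean_sq:.2f}")
```

Since A is bounded below by zero and its distribution is skewed when nu is not large compared with sigma, symmetric intervals of plus or minus two standard errors are a poor description of its uncertainty.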
l247: Why are you comparing the SCVF_15 statistic? This is a ratio of sample statistics and likely to be very noisy. I don't really know what to make of Fig 9a. With such a small dataset, I would like to see the ratio of total variance (the demodulate amplitude at tau=48hr), instead.
Ans.: Indeed the SCVF_15 statistic from local mean autocovariance was very noisy. We now only compute it from larger populations of sample autocovariances. Fig 9 is now showing the variance instead of SCVF_15.
l260: Previously (in the Argo vs HYCOM comparison) you used the ratio of the demodulate amplitude at 48hr lag. Why not use that same quantity for comparison? Oh  I see it in Table 1.
Ans.: 
l262: I am not sure what the "discrepancy" refers to.
Ans.: Poor writing, rewrote the sentence.
l265: SCVF^{15} > SCVF_{15}
Ans.: Corrected
l273: I don't think "no impact" is the correct way to characterize the previous results. There is considerable scatter in Fig 5, and Fig 3b shows that the estimates differ. Also, it is unclear to me why you don't try to make the estimates more consistent by extrapolating the demodulate amplitude to zero lag.
Ans.: Replaced “no impact” by “no significant impact”. We do not know what is the true autocovariance function and how it behaves close to 0 time lag. The first demodulate is a conservative, simple, and robust estimate of the IT variance. It is also consistent within this work.
l275: "mecanism" > "mechanism"
Ans.: Corrected
l276: Why not just call it "Lagrangian decorrelation" instead of "apparent decorrelation"? If I had been a reviewer on Caspar-Cohen, I would have made the same suggestion.
Ans.: Replaced "apparent decorrelation" by "Lagrangian decorrelation", everywhere.
l306: "sinusoide" > "sinusoid"
Ans.: Corrected
Table 2: It is interesting that the \omega_{AM} frequency corresponds to M2-S2 beating, but the amplitude (\sigma^2_{AM}) does not.
Ans.: The M2-S2 beating, although probably dominating the semidiurnal amplitude modulation, is definitely not the only contribution to this amplitude modulation (other constituents close to M2 play a role).
l330-358: Modes discussion.
Ans.: 
l364-377: Bathymetry discussion. Surely the importance of the errors depends on the horizontal spatial scale of the errors. While this is interesting discussion, it ought to consider the wavenumber spectrum of the error.
Ans.: We do not have quantitative information on this error spectrum.
l386-401: Stratification. None of these discussions really deal with the overall quantitative difference of HYCOM vs obs which is about 0.74 (HYCOM/Argo) or 0.51 (HYCOM/Mooring) equatorward of 50 deg. Both datasets have problems in terms of their spatial coverage, but the Argo comparison seems much more meaningful. I am unclear which of the authors' proposed sources of bias could account for the 26% deficit compared to Argo.
Ans.: Added summary sentence at the end of Sect. 4.
l403: "run with" > "run of"
Ans.: Corrected
l416: Shouldn't the factor of 1.5 mentioned here equal the reciprocal of the 0.74 value at l216? Have I misunderstood this?
Ans.: This factor of 1.5 was already mentioned l.225. Contrary to the 0.74 value at l.216, it does include latitudes north of 50 deg N. Rewrote with factors consistent throughout the section. Added Table 1 to summarize the statistics for the different groups.
l422: "stationnary" > "stationary"; also, I think the "big O" notation should be reserved for asymptotics, and here it is better to say "about" or "approximately". Finally, I don't think "becomes stationary" is an appropriate descriptor; this would be like saying a time series "becomes its mean".
Ans.: Corrected. Replaced “the IT becomes stationary” by “the IT autocovariance reaches its stationary limit“.
l429: "unaffected" > "unaffected in the mean"
Ans.: Corrected
l339: "supposedly account" > "supposedly accounts"
Ans.: Corrected
l452: latex formatting needs help in the URL.
Ans.: Corrected
l454: Zaron's URL has changed to https://ingria.ceoas.oregonstate.edu/~zarone/downloads.html
References: inconsistent capitalization is used in article titles
Ans.: Corrected
l557: "and contributors, T. P."
Ans.: Corrected
RC2: 'Comment on egusphere20221085', Anonymous Referee #2, 07 Dec 2022
General comments:
This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.
I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest for the authors to properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not taking that approach from the beginning? I also suggest to replace the term "demodulate" by something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:
 comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?
 comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?
 Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?
Specific comments and technical corrections:
Abstract:
line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "autocorrelation" or "autocovariance" which are established statistical terms. An abstract should be able to stand alone.
It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.
l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?
l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?
Section 2.3: Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?
l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?
l101: Why 41644? Does this correspond to a mean geographical density?
l106: The effects of the drift? Do you mean potential Lagrangian biases?
Section 3.1; eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?
l113-115: which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.
Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?
l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?
l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?
Eq4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its square value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not distributed like a Gaussian variable. Thus, confidence intervals as plus or minus two standard errors are likely incorrect. Consider your figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.
l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.
I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance, see Lilly and Gascard (2006) as an example (The analytic signal can be calculated using the Hilbert transform in python with the scipy package or the anatrans.m function of the jLab toolbox for Matlab). One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
l190: "outliers": please use sentence to explain what you mean.
l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.
l198: "taken as ..." : state this earlier to remind the reader.
l208: a bias which means that HYCOM underestimates Argo, correct?
Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
l214: The ratio increases approaching the poles? Where is this seen?
l225: Should you conclude the section with some statement?
Section 4.2:
l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?
l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).
l235: yes indeed because the autocovariance and its amplitude are probably not Gaussian distributed!
l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.
l253: "log domain" : this figure appears to be on a linear scale?
l270: truely > truly
Figure 11b: a legend for the various fitted curve would be very helpful.
l293: Why is it 3 times T_{int}?
l306: "Note that ..." : this should be moved earlier just after your eq 6.
l315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.
Section 4.4:
l323: If your method holds, should you not rather say that the model is biased low?
Data availability: A statement on the HYCOM data availability is missing.
Citation: https://doi.org/10.5194/egusphere20221085RC2 
AC2: 'Reply on RC2', Gaspard Geoffroy, 24 Jan 2023
We would first like to thank both referees for their valuable input. Many of their remarks proved pertinent and, overall, contributed to improving the manuscript.
General comments:
This is a very valuable and comprehensive study that compares the semidiurnal tidal variance and autocovariance in a global numerical model and in in situ observations. This paper is well written and mostly logically organized. I would recommend publication after the main concern below is addressed as well as the smaller technical comments.
I have some serious concerns about the method employed to compare autocovariance function estimates. After calculating average autocovariance functions, the authors essentially estimate the autocovariance function amplitudes and their associated confidence intervals. Yet, the method employed for calculating these confidence intervals appears to be flawed because it assumes that the amplitude estimates are normally distributed, which is not the case. This is clearly illustrated as an example in Figure 2b that shows confidence intervals crossing the zero value: true amplitude values cannot be less than zero. I suggest for the authors to properly derive error estimates and confidence intervals for the autocovariance amplitude before pursuing the rest of this study. I provide some potential ways of doing so in my detailed comments below. In fact, in Figure 10, as an example, the authors take a better approach by displaying quantiles of the distributions: why not taking that approach from the beginning?
Ans.: We wrongly assumed Gaussian distributed complex demodulates, hence our confidence intervals were incorrect. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
I also suggest to replace the term "demodulate" by something more meaningful: perhaps autocovariance envelope or amplitude? As noted below, the method to obtain the "complex demodulate" could be improved by simply computing the analytic transform of the autocovariance functions.
Ans.: From our understanding, the analytic transform would not capture the amplitude at the M2 frequency without bandpass filtering the \eta_1000 time series a priori.
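For reference, the envelope estimate suggested by the referee can be sketched with `scipy.signal.hilbert`, which returns the analytic signal of a real series. The synthetic autocovariance below (a pure M2 cosine with an exponentially decaying envelope) and its parameters are assumptions chosen for illustration; they are not taken from the manuscript.

```python
import numpy as np
from scipy.signal import hilbert

M2_PERIOD_H = 12.42                        # M2 period in hours
lags = np.arange(768)                      # hourly lags over about 32 days

# Synthetic autocovariance: decaying envelope times an M2 carrier.
true_envelope = 10.0 * np.exp(-lags / 400.0)
acov = true_envelope * np.cos(2 * np.pi * lags / M2_PERIOD_H)

# Amplitude of the analytic signal = envelope of the oscillation.
envelope = np.abs(hilbert(acov))

# The FFT-based transform has end effects, so compare away from the edges.
interior = slice(100, 600)
max_err = np.max(np.abs(envelope[interior] - true_envelope[interior]))
print(f"max interior envelope error: {max_err:.3f} m^2")
```

In this idealized case the carrier is a single frequency, so the envelope is recovered well. For a real autocovariance containing variance at other frequencies, the analytic-signal amplitude mixes all of them, which is the limitation the authors raise: isolating the semidiurnal band would require filtering first.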
For the validating part of the study, section 3, the current organization of the material does not make sense to me and the conclusions are not clearly laid out. First, I would like to know how the model does in an Eulerian framework, then I would like to know if the Lagrangian framework or method is valid, and third I would like to know the result of comparing Argo and Lagrangian HYCOM particles. As such, I suggest the following reorganization of the material of section 3:
 comparison of HYCOM Eulerian results and mooring (Eulerian) results to assess the model: what is the conclusion?
 comparison of HYCOM Eulerian and Lagrangian results to assess the method of using Lagrangian data: what is the conclusion on the potential Lagrangian bias?
 Comparison of HYCOM Lagrangian results and Argo (Lagrangian) results: what is the conclusion?
Ans.: Section 3 in itself is not about validating HYCOM, but introducing, using a local example, the methods used to further validate HYCOM (section 4). There are other papers describing strict Eulerian point-to-point comparisons between HYCOM and moorings (cf. cited papers Ansong, 2017 and Luecke, 2020). Rather, the Eulerian component of our analysis is designed to bolster and extend the main Lagrangian component (added clarification at the end of section 1). Therefore, the logical beginning of section 3 is the methodology developed by Geoffroy and Nycander (2022) to estimate the variance of the semidiurnal IT using Lagrangian data. We then developed a Eulerian framework using the HYCOM data primarily to validate our Lagrangian methodology. Incidentally, it also enables the analysis of the decorrelation of the IT. This mirrors the organization of section 4.
Comments: Abstract:
line 1: In the abstract, unless you explain there what you mean by "decorrelate" as you do in the main text, I think that instead of "correlation" you should write "autocorrelation" or "autocovariance" which are established statistical terms. An abstract should be able to stand alone. It may already be in the title but perhaps you could rephrase the abstract to provide a summary statement of what you are doing: validating a model by comparing it to in situ observations.
Ans.: Reworked abstract. Suppressed “decorrelate”.
l46: It is not obvious (to me) what the k-space methodology is. I suggest that you rephrase or explain. Does this refer to the method of Zaron (2017)?
Ans.: Yes. Rephrased.
l71-72: It is not good practice to refer to a section ahead. Simply explain that the duration was chosen to match the numerical output you are using/comparing?
Ans.: Corrected
Section 2.3:
Perhaps a reference for HYCOM and that specific simulation is needed. Should you add more details about the use of the Parcels software that would allow readers to replicate your experiment?
Ans.: The reference for HYCOM (Chassignet et al., 2006) was given in the introduction, the first time we used the HYCOM acronym. We do not have any other reference for this particular simulation (apart from `GLBy190.04'). Added information on the Lagrangian simulation.
l96: Why "mainly"? And can you simply state why you used only 32 days of the model? Data/space constraints?
Ans.: Suppressed “mainly”. Added paragraph at the beginning of the section regarding the data we use. The model wasn’t run specifically for this study, we used the data that were available and suitable for our methodology. Hence, there are no real technical constraints to mention.
l101: Why 41644? Does this correspond to a mean geographical density?
Ans.: Yes, it roughly corresponds to a mean density of 15 particles in our final 200 km radius circular patches. We feel it is unnecessary to add this; moreover, it would not be easy to motivate clearly at this stage of the paper.
l106: The effects of the drift? Do you mean potential Lagrangian biases?
Ans.: As pointed out by referee #1, we prefer to call these effects “Lagrangian decorrelation”.
Section 3.1:
eq. 1: Could we get here an explanation of this quantity and why it represents the vertical displacement of an isotherm? Perhaps cite Hennon et al. 2014 as you did in Geoffroy and Nycander 2022? Are you correcting for the float displacement as you did in that paper?
Ans.: Added explanations. We do correct for the float displacement for the Argo data.
l113-115: which monthly-mean 3D temperature field? Is it from a product for the case of Argo? Please provide more details; I do not understand how you get that gradient for the in situ data.
Ans.: Added information. It is the modeled monthly-mean 3D temperature field introduced in section 2.3. For the Argo data, we compute the temperature gradient at 1000 dbar for a given park phase using the temperature profiles recorded by the float immediately before and after that park phase (now made clearer in section 3).
Figure 1: The Argo segments are shown as dots? How are these segments? Could you plot the assumed rectilinear trajectories of the Argo floats?
Ans.: Figure 1 shows the median position of the Argo segments as dots (now made clear in the text). Added Argo trajectories.
l 131: I don't get this: what is a "binned HYCOM particle"? Do you mean that you average the individual autocovariance estimates in Eulerian bins?
Ans.: Rephrased.
l136: Don't you think that in that figure the R_{argo} falls below its CI at ~100h rather than at ~200h?
Ans.: Here R_{argo} is the red curve. We do see it fall below its CI at ~200h.
Eq. 4 and after: I am not sure that this is the right way to compute the confidence interval for A: what you call the complex demodulate, or rather its squared value (A^2), should be distributed like a chi-square variable with 2 degrees of freedom (like a spectral estimate), and not like a Gaussian variable. Thus, confidence intervals of plus or minus two standard errors are likely incorrect. Consider your figure 2b: the CIs suggest that A can take negative values whereas it is clearly a positive quantity. I suggest you revise the derivation of the CI for A and reassess your overall results.
Ans.: Correct. Referee #1 also pointed out the questionable definition of the total error in equation (4). For both these reasons, we now use a Monte Carlo method to estimate the confidence interval of the complex demodulates.
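The answer does not specify which Monte Carlo scheme is used. As one possible illustration only, the sketch below uses a bootstrap over particles: the individual autocovariances are resampled with replacement, averaged, and demodulated at the semidiurnal frequency by a least-squares harmonic fit. The function names, the 12.42 h period and the bootstrap itself are assumptions of this sketch, not the authors' method.

```python
import numpy as np

M2_PERIOD_H = 12.42  # ASSUMPTION: semidiurnal (M2) period in hours


def demodulate(acov, lags_h, period_h=M2_PERIOD_H):
    """Amplitude of the least-squares fit a*cos(w*t) + b*sin(w*t) to acov."""
    w = 2.0 * np.pi / period_h
    design = np.column_stack([np.cos(w * lags_h), np.sin(w * lags_h)])
    coef, *_ = np.linalg.lstsq(design, acov, rcond=None)
    return float(np.hypot(coef[0], coef[1]))


def bootstrap_ci(acovs, lags_h, n_draws=1000, level=0.95, seed=0):
    """Percentile interval of the demodulated amplitude of the mean autocovariance.

    acovs has shape (n_particles, n_lags); particles are resampled with replacement.
    """
    rng = np.random.default_rng(seed)
    n_particles = acovs.shape[0]
    amps = np.empty(n_draws)
    for k in range(n_draws):
        idx = rng.integers(0, n_particles, size=n_particles)
        amps[k] = demodulate(acovs[idx].mean(axis=0), lags_h)
    lo, hi = np.quantile(amps, [(1.0 - level) / 2.0, (1.0 + level) / 2.0])
    return float(lo), float(hi)


# Synthetic example: 50 noisy copies of a semidiurnal autocovariance of amplitude 2.
lags = np.arange(0.0, 37.0, 1.0)
rng = np.random.default_rng(1)
signal = 2.0 * np.cos(2.0 * np.pi * lags / M2_PERIOD_H)
acovs = signal + rng.normal(0.0, 1.0, size=(50, lags.size))
ci_low, ci_high = bootstrap_ci(acovs, lags)
print(ci_low, ci_high)
```

Because the interval is built from percentiles of amplitudes, its lower bound can never be negative, which addresses the referee's remark about Figure 2b.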
Figure 2b: the CIs for the two curves are superimposed and thus cannot be distinguished; please modify the figure so that the reader can see both.
Ans.: Figure modified.
l156: Considering my remark above that your CIs are likely incorrect, I think you should revisit that statement.
Ans.: Also discussed by referee #1. Modified the sentence.
I believe that what you are trying to plot in Figure 2b is the envelope of the autocovariance function. Your method is probably fine but the envelope can be easily obtained by computing the amplitude of the analytic signal of the autocovariance; see Lilly and Gascard (2006) for an example. (The analytic signal can be calculated using the Hilbert transform in Python with the scipy package, or with the anatrans.m function of the jLab toolbox for Matlab.) One way to get a confidence interval for the amplitude of the analytic transform would be to look at the distribution of all the individual transform amplitudes, lag value by lag value (as you do in Figure 10 later).
Ans.: Our complex demodulate method specifically selects the amplitude at the semidiurnal frequency. Our understanding is that, unless the time series are bandpass filtered prior to computing the autocovariance, the analytic transform cannot be used to isolate the amplitude of the oscillations at the semidiurnal frequency.
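The distinction drawn in this answer can be illustrated with a synthetic autocovariance containing both a semidiurnal and a diurnal oscillation. The periods, amplitudes and function name below are assumptions chosen for illustration; the demodulation is implemented here as a least-squares harmonic fit, which may differ in detail from the authors' procedure.

```python
import numpy as np
from scipy.signal import hilbert

SEMIDIURNAL_H = 12.42  # ASSUMPTION: M2 period in hours
DIURNAL_H = 23.93      # ASSUMPTION: K1 period in hours


def demodulated_amplitude(acov, lags_h, period_h):
    """Amplitude of the harmonic at period_h, from a least-squares fit."""
    w = 2.0 * np.pi / period_h
    design = np.column_stack([np.cos(w * lags_h), np.sin(w * lags_h)])
    coef, *_ = np.linalg.lstsq(design, acov, rcond=None)
    return float(np.hypot(coef[0], coef[1]))


lags = np.arange(0.0, 240.0, 1.0)
acov = (1.0 * np.cos(2.0 * np.pi * lags / SEMIDIURNAL_H)
        + 0.5 * np.cos(2.0 * np.pi * lags / DIURNAL_H))

# Envelope from the analytic signal: mixes both frequencies, so it beats.
envelope = np.abs(hilbert(acov))
interior = envelope[24:-24]  # drop the edges, where the FFT-based transform is less reliable

# Demodulation at the semidiurnal frequency: recovers that component alone.
amp_sd = demodulated_amplitude(acov, lags, SEMIDIURNAL_H)

print(interior.min(), interior.max(), amp_sd)
```

In this example the envelope oscillates as the two components move in and out of phase, whereas the demodulated amplitude stays close to the semidiurnal amplitude of 1.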
l190: "outliers": please use a sentence to explain what you mean.
Ans.: Also pointed out by referee #1. Added sentence. \eta_1000 values can be unstable when the temperature gradient in the denominator of Eq. (1) and (5) is small, leading to unrealistically large variance. For consistency with the Argo results, we apply the same quality checks to the variance of \eta_1000 computed from HYCOM data.
l194: Figure 5 does not look like a scatter plot but a 2D density plot. Is the R^2 exactly 1 as written in the plot or is it approximately 1 as stated in the text? I am surprised that it is so close to one. What is it in a domain that is not logarithmic? What is your R2 anyway? The adjective "Pearson" is usually used for the correlation coefficient while the coefficient of determination is the correlation squared for linear regression.
Ans.: Deleted “scatter”. There was an error in the script calculating R^2. The correct value in the log domain is 0.98; in the non-log domain it is 0.74. Our R^2 is (Pearson’s R)^2. Changed to r^2 to avoid any confusion with the autocovariance (denoted R).
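The dependence of r^2 on the domain in which it is computed can be sketched with synthetic data. The values below are drawn from a log-normal distribution purely for illustration and are unrelated to the HYCOM or Argo variances; the function name is hypothetical.

```python
import numpy as np


def pearson_r2(x, y):
    """Square of Pearson's correlation coefficient between x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)


rng = np.random.default_rng(0)
n = 5000
# Synthetic positive quantities spanning several decades (illustrative only).
x = np.exp(rng.normal(0.0, 1.5, size=n))
# y equals x up to multiplicative noise, i.e. additive noise in the log domain.
y = x * np.exp(rng.normal(0.0, 0.3, size=n))

r2_log = pearson_r2(np.log10(x), np.log10(y))
r2_lin = pearson_r2(x, y)
print(f"r^2 (log domain): {r2_log:.3f}, r^2 (linear domain): {r2_lin:.3f}")
```

The linear-domain value is controlled by the few largest points, while the log-domain value weights all decades equally, so the two generally differ and should be reported with the domain stated.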
l198: "taken as ..." : state this earlier to remind the reader.
Ans.: Added statement earlier in the text.
l208: a bias which means that HYCOM underestimates Argo, correct?
Ans.: Correct. Rephrased.
Figure 7b: try the ratio x/(x+y) instead of log10(x/y) as in Arbic, Elipot et al. 2022. In this way you will not have to use log10 and truncate the scale. The results will look the same but it is a better statistic that is robust to outliers.
Ans.: Replaced log10(x/y) by x/(x+y).
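A minimal sketch of why x/(x+y) needs no truncated scale: it is bounded between 0 and 1, equals 0.5 when the two estimates agree, and is a monotonic function of log10(x/y). The sample values and function name are made up for illustration.

```python
import numpy as np


def bounded_ratio(x, y):
    """x / (x + y) for positive x and y; 0.5 means the two estimates agree."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return x / (x + y)


# Illustrative values, including two extreme outliers.
x = np.array([1.0, 1.0, 1.0, 1e-6, 50.0])
y = np.array([1.0, 2.0, 1e-6, 1.0, 1.0])

ratio = bounded_ratio(x, y)
log_ratio = np.log10(x / y)

# The two statistics carry the same information: ratio = 1 / (1 + 10**(-log_ratio)).
print(ratio)
print(log_ratio)
```

The outliers that reach plus or minus 6 in log10(x/y) are mapped to values just inside 1 and 0, so the color scale does not need to be clipped.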
l214: The ratio increases approaching the poles? Where is this seen?
Ans.: The ratio itself is not shown. Added comment.
l225: Should you conclude the section with some statement?
Ans.: Added sentence referring to the discussion on potential sources of biases.
Section 4.2:
l227-228: I do not understand what you mean by that. Please explain what is the intrinsic decorrelation. Do you mean Eulerian? In fixed space?
Ans.: Also noted by referee #1. Everywhere replaced "intrinsic decorrelation" by "decorrelation of the IT", or "decorrelation", and further "apparent decorrelation" by "Lagrangian decorrelation".
l229-230: "particle in the Eulerian framework": what do you mean? You average in Eulerian/geographical bins? I think you should use "Eulerian framework" for computing autocovariance from Eulerian time series (model grid and moorings) and "Lagrangian framework" for computing autocovariance from Lagrangian time series (model particles and Argo).
Ans.: Changed the clause to “We compute a sample autocovariance in the Eulerian framework for each particle”. This refers to the sample autocovariance computed in our Eulerian framework, as explained in section 3.3: it is computed along the particle’s trajectory using Eulerian data.
l235: yes indeed because the autocovariance and its amplitude are probably not Gaussian distributed!
Ans.: Agreed, by definition our demodulates are Rice distributed. Deleted “(and their demodulates)”. However, we emphasize that the sample mean autocovariance at a given time lag can be considered Gaussian distributed in the two following cases:
 In a local geographical patch: we assume the particles are randomly sampling a wave field with uniform statistics.
 When computing the sample mean autocovariance from a very large population of particles (global or regional scales), by virtue of the central limit theorem.
l251: "the distribution is well centered on the y = x": I strongly suggest you revise this assessment. Figure 9a suggests no linear relation between the mooring results and the model results.
Ans.: As noted by referee #1, as a ratio of sample statistics, SCVF_15 is expected to be noisy. We now plot the IT variance instead.
l253: "log domain" : this figure appears to be on a linear scale?
Ans.: Plot changed.
l270: truely > truly
Ans.: Corrected.
Figure 11b: a legend for the various fitted curve would be very helpful.
Ans.: Added legend.
l293: Why is it 3 times T_{int}?
Ans.: Because 95% of the exponential decay is achieved within 3 time constants (exp(-3) ≈ 0.05). Added comment.
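The factor of 3 can be checked numerically. The sketch assumes a pure exponential decay exp(-t/T_int); the function name is hypothetical.

```python
import math


def fraction_decayed(n_time_constants):
    """Fraction of an exponential decay exp(-t/T) completed after t = n*T."""
    return 1.0 - math.exp(-n_time_constants)


remaining = math.exp(-3.0)          # what is left after 3 time constants
completed = fraction_decayed(3.0)   # what has decayed after 3 time constants
print(f"remaining: {remaining:.4f}, completed: {completed:.4f}")
```

After one time constant only about 63% of the decay has occurred, which is why a multiple of T_int is needed to capture most of it.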
l306: "Note that ..." : this should be moved earlier just after your eq 6.
Ans.: This remark explains the results of the fitting. We did not assume this when defining our model.
l315-319: What are the implications of this comparison for HYCOM? Could you expand? I understand you address this next but a transition sentence at the end of a section would be useful.
Ans.: Expanded on the implications for HYCOM.
Section 4.4:
l323: If your method holds, should you not rather say that the model is biased low?
Ans.: Yes. Since we see no reason for the in situ data or their processing to be biased high, our conclusion is indeed that HYCOM is biased low.
Data availability: A statement on the HYCOM data availability is missing.
Ans.: Added statement.

AC2: 'Reply on RC2', Gaspard Geoffroy, 24 Jan 2023