Assessing the impact of future altimeter constellations in the Met Office global ocean forecasting system
Abstract. Satellite altimeter measurements of Sea Level Anomaly (SLA) are a crucial component of current operational ocean forecasting systems. The launch of the Surface Water and Ocean Topography (SWOT) wide-swath altimeter mission is bringing a step change in our observing capacity, with two-dimensional mesoscale structures now observable over the global ocean. Proposals are now being considered for the make-up of the future altimeter constellation. In this study we use Observing System Simulation Experiments (OSSEs) to compare the impact of additional altimeter observations from two proposed future satellite constellations. We focus on the expected impact on the Met Office operational ocean analysis and forecasting system of assimilating an observation network including either 12 nadir altimeters or 2 wide-swath altimeters.
Here we show that an altimeter constellation of 12 nadir altimeters produces greater reductions in the errors in sea surface height (SSH), surface currents, temperature and salinity fields than a constellation of 2 wide-swath altimeters. The impact is greatest in the dynamic Western Boundary Current regions, where the nadir altimeters can reduce the SSH root-mean-square (RMS) error by half, while the wide-swath altimeters reduce it by only one-quarter. A comparison of the spatial scales resolved in daily SSH fields also highlights the superiority of the nadir constellation in our forecasting system. We also highlight the detrimental impact spatially-correlated errors could have on the immediate use of wide-swath altimeter observations. However, we still achieve promising impacts from the assimilation of wide-swath altimetry, and work is ongoing to develop improved methods to account for spatially-correlated observation errors within our data assimilation scheme.
Status: closed
RC1: 'Comment on egusphere-2024-756', Anonymous Referee #1, 01 May 2024
Review of “Assessing the impact of future altimeter constellations in the Met Office global ocean forecasting system” by R.R. King et al.
General Comments:
This study presents a series of Observing System Simulation Experiments performed to assess the relative benefits of two proposed satellite altimetry approaches. The first involves 12 nadir altimeters (i.e. similar to conventional along-track satellite altimetry; NADIR), whereas the second approach would use two wide-swath satellite altimeters (2WISA) similar to the current SWOT mission launched at the end of 2022.
The manuscript is clearly written and provides a well-justified methodology. Results are assessed using both standard approaches together with more sophisticated time and spatial scale dependent scores. A state of the art high-resolution ocean forecasting system is used for the study thereby providing an excellent assessment of the potential impacts of the two proposed altimeter constellations.
There are several issues I feel that the authors should address prior to publication:
- A critical aspect of the study is how the errors are estimated and applied to synthetic observations and how they are specified in the assimilation system. In particular, the errors applied to observations used for NADIR and 2WISA are poorly described and lack details regarding the amplitudes of perturbations applied. The degree to which the system is constrained to the nature run will depend not only on observational coverage, but also on the observational errors applied. This could have an impact on both the RMS errors as well as the spectral properties and ability to make use of the smaller scales present in the wide swath measurements. There are several specific suggestions provided below on how this aspect could be improved.
- A significant degradation is found for the 2WISA experiment in the Northeast Pacific Ocean as compared to the Control (with 2 nadir altimeters). The impact of this degradation is visible in many of the figures, and the authors have clearly done their best to avoid this feature in their interpretation of results (e.g. Fig. 11, which only shows results for the Atlantic Ocean). In the conclusions, the authors claim that this is related to the bias correction method for SLA observations and that the impact is isolated in the northeast Pacific, but admit they were unable to explain why this occurs. I can understand that correcting and rerunning the experiments would be a costly and time-consuming effort; however, it's difficult to be sure that this issue is not the cause of the reduced benefits found for 2WISA, especially when the results presented are opposite to those of similar studies published previously (as noted in the conclusions). This issue is compounded by the fact that very little information is provided to describe the procedure. The authors should provide a clearer justification and description, together with some evidence to support their claim that the impact does not affect results in other regions.
- Finally, I agree with the authors' conclusion that the spatial and temporal sampling differences between the 12 nadir and 2 wide-swath approaches are likely the primary source of the differences presented. However, it would be helpful to illustrate this aspect more clearly. Fig. 1 presents differences in coverage between 1-day and 7-day windows. However, it would be useful to show how the 21-day repeat coverage of 2WISA affects the assimilation statistics. For example, differences could be shown for a small region (e.g. the size of the spatial correlation scales applied?) highlighting the intermittence of the 2WISA experiment as compared to NADIR. Are the SSH errors for 2WISA smaller than NADIR following the overpasses, and do they then grow with time? If so, this would demonstrate that the wide-swath data are being correctly assimilated and that the reduced impact is indeed due to the sampling (see the sketch below). On the other hand, if the problem is the observational error specification (or the SLA bias correction procedure), we would see that even following an overpass of a wide-swath altimeter the 2WISA experiment would fail to constrain smaller scales. Additionally, maps showing differences in increments over the Gulf Stream region could also reveal whether differences between NADIR and 2WISA are due to the presence of SLA biases requiring constant increments at each cycle to keep the system close to the nature run, as opposed to correcting chaotic turbulence (which then grows between cycles).
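For concreteness, the overpass-intermittence diagnostic suggested in the comment above could be computed as in the following minimal sketch (hypothetical variable names, a placeholder Gulf Stream box, and synthetic stand-in data; not the authors' code). A sawtooth pattern in the regional RMSE, dropping after each wide-swath overpass and growing in between, would support the sampling interpretation; an elevated RMSE even immediately after overpasses would instead point to the error specification or the bias correction.

```python
import numpy as np

def regional_rmse(err, lat, lon, box):
    """Area-mean RMSE of a (time, lat, lon) SSH error field over a lat/lon box."""
    lat0, lat1, lon0, lon1 = box
    mask = ((lat >= lat0) & (lat <= lat1))[:, None] & \
           ((lon >= lon0) & (lon <= lon1))[None, :]
    return np.sqrt(np.nanmean(err[:, mask] ** 2, axis=1))

# Synthetic stand-in for daily SSH error fields on a 1/4-degree grid.
rng = np.random.default_rng(0)
lat = np.arange(30.0, 45.0, 0.25)
lon = np.arange(-70.0, -50.0, 0.25)
err = 0.1 * rng.standard_normal((180, lat.size, lon.size))

rmse_series = regional_rmse(err, lat, lon, box=(35.0, 42.0, -65.0, -55.0))
# Plotting rmse_series against analysis date, with 2WISA overpass times
# marked, would show whether errors shrink at each pass and regrow after.
```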
Specific Comments:
L20: Technically the satellites measure SSH. SLA is obtained by removing a mean SSH surface. As SSH is referred to later in the paper, it would be better to use consistent language throughout.
L25: The following sentence uses the term “nadir”. Include it here to make it clear what this means.
L39: It would be better to define WiSA when SWOT is first mentioned (or not at all). It is not clear to me what the benefit is of using an acronym for wide-swath altimetry but not for nadir altimetry, when the two alternative observing-system approaches being considered have names of nearly the same length. Moreover, WiSA is used inconsistently throughout the paper, with “wide-swath altimetry” used at times and WiSA at others. It is also somewhat confusing to have both the acronym WiSA and the name of one of the experiments, 2WISA. I would propose to remove the WiSA acronym and just say “wide-swath altimetry” and “nadir altimetry” to be clear.
L101: It would be good to note that the sea ice model is also different.
L112: A brief description of how the observation errors are generated should be added along with details concerning the amplitude of the errors and whether spatially-correlated errors are introduced. It would be helpful to note here the different types of error (instrumental, representativeness) and what is being estimated here.
L124: It would be helpful to elaborate on what you mean by “realistic errors”. For the baseline nadir altimeters what amplitude is applied for the error?
L134: Final L3 altimetry products typically used for assimilation are corrected for many of the raw satellite errors (tides, DAC, longwave, wet tropospheric). How do you deal with this?
L134: You mention here that the only uncorrelated errors are Karin noise and residual path delay error. However, later, on line 344, it is noted that Karin and wet tropospheric errors are used. Also, it would be helpful to provide the amplitude of the perturbations applied to the synthetic observations to simulate errors.
L135: Superobbing to 10km. But nadir is at 6km? Why not make it the same? This choice reduces the along-track resolution of the wide-swath data and could affect the extent to which small scales are constrained.
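For readers unfamiliar with the term, "superobbing" here means averaging all observations that fall in the same target-grid cell into a single super-observation. A minimal sketch under assumed conventions (the cell size, names and example values are illustrative, and the metric scaling of longitude by cos(latitude) is crude):

```python
import numpy as np

def superob(lats, lons, vals, cell_km=10.0):
    """Average observations sharing a ~cell_km bin into one super-observation."""
    km_per_deg = 111.0
    i = np.floor(lats * km_per_deg / cell_km).astype(int)
    j = np.floor(lons * km_per_deg * np.cos(np.deg2rad(lats)) / cell_km).astype(int)
    _, inv = np.unique(np.stack([i, j]), axis=1, return_inverse=True)
    return np.bincount(inv, weights=vals) / np.bincount(inv)

# e.g. ~2 km swath pixels reduced to ~10 km super-observations:
lats = np.repeat(np.linspace(35.0, 36.0, 50), 20)
lons = np.tile(np.linspace(-60.0, -59.6, 20), 50)
vals = np.random.default_rng(1).normal(0.0, 0.1, lats.size)
superobs = superob(lats, lons, vals, cell_km=10.0)
```

Averaging to 10 km rather than 6 km both reduces the observation count and, as the comment notes, removes wavelengths below roughly twice the cell size from the wide-swath data.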
L138: Is nadir data affected by SWH?
L147: Why mention NEMO version number and not CICE version number? Please add version number for the latter.
L156: Here you mention SSH observations. It would be good to be consistent with the introduction and stick to either SLA or SSH (unless there is a reason to differentiate).
L158: “…all assimilated together, and …”. Remove comma. The rest of the sentence isn’t very clear. It would be good to reword.
L164: It would be helpful to state the length scales used as they have a direct impact on how the altimetric information will affect the model solution.
L175: What high-frequency errors are being referred to here? DAC? A bias should have a time mean, but it is mentioned that high-frequency signals are removed from observations by the simulator. Which is it? Additionally, problems with this bias correction term are used to explain the poor performance of the 2WISA experiment in the Northeast Pacific Ocean. As such, it would be appropriate to provide additional detail. Indeed, it would be good to add a comment here regarding the role of this correction in affecting the results later on.
L186: Why not run for a full year? Evaluating over only Jan. to July will create a hemispheric bias with respect to the sea ice cover (see comments regarding Fig. 3).
L190: Why does the 2WISA experiment not include Sentinel 3a/b? It would allow us to see the impact of adding the 2 wide swath altimeters directly. As it stands, a comparison of 2WISA and Control would include the impact of the 2 wide-swath altimeters, but also the impact of removing 2 nadir altimeters (Sentinel 3a/b). It seems the removal of these two altimeters has an important impact on the results. As such, an additional simulation with only 1 altimeter (Sentinel6) would provide a means to separate the effects (or rather, with 2 wide-swath altimeters and sentinel3a/b).
L205: It is not clear to me why the authors are comparing min/max values using innovations from the operational system to min/max values using the full grid in the OSSE. Why not assess min/max values from the OSSE innovations? This way the sampling would be the same.
L209: It is standard practice to produce an OSE prior to the OSSE to verify the OSSE framework provides an equivalent response (e.g. when withholding altimetry data). Has an OSE been performed? For example, it would be useful to see the impact of withholding sentinel 3a/b in an OSE and in the OSSE (see comment regarding line 190). This may help to explain some of the areas of degradation seen in the 2WISA experiment.
L220: It would be helpful to know how the total number of observations differs between NADIR and 2WISA experiments. Fig. 1 gives a qualitative sense to this, but total number of observations would give an idea how well the system is able to benefit from the information. Also, the choice to apply a “superobbing” of the data to a 10km grid will affect this number.
L220 (Fig. 2): It would be helpful to provide statistics for other regions, especially the Gulf Stream region since this is the focus of Fig. 4. The global statistics will be strongly affected by the strong signal found under ice (see comment below), thereby biasing the overall (ice-free) results. The global results are also affected by the problem in the northeast Pacific.
L223: (Fig. 3): Why only show monthly mean for July? Given that the statistics are quite stationary (apart from initial few analyses), using fields for the full simulation would provide more robust statistics.
L225: Why is there an impact under the sea ice? Were synthetic SSH data used even where there is sea ice? Since this is not typically done in the operational systems, it should be rejected in the OSSE as well. In July there should be considerable sea ice cover in the Southern Ocean. As a result, a change in the SSH data sets assimilated shouldn't have an impact under the ice. If the study has used data under the sea ice, this needs to be mentioned and the impact of this choice discussed in detail, as it appears to have a first-order impact on the study results.
L226: The degradation in the northeast Pacific is quite unusual. The cause for this feature should be mentioned here as it is not associated with the wide-swath data themselves but rather a suggested problem in the SLA bias correction. Without this explanation, it suggests there is a problem in the experimental setup and undermines the reliability of the findings.
L235: It would be helpful to have a timeseries of RMS error over a small region to illustrate the point being made here about the sporadic sampling in time. We should see smaller errors following assimilation which grow between overpasses. It would also help to demonstrate that the WiSA data are being assimilated correctly and that the issue is indeed the time sampling.
L238: “The assimilation of 12 nadir altimeters…”. In fact, the NADIR experiment assimilates 13 nadir altimeters, doesn't it? (i.e. 12 + Sentinel6). It is mentioned that the Control has three (2 + Sentinel6), so a consistent nomenclature should be used.
L244 (Fig. 5): Why do both experiments show a strong (20%) degradation in temperature below 1000m in the Gulf Stream region?
L258: Surface currents will be strongly affected by the winds applied. Would it not be more interesting to assess just below the surface (e.g. 15m depth) where a large impact on geostrophic currents should be visible? Use of 15m depth would also allow greater potential transferability of results to actual comparisons with drogued drifters.
L276 (Fig. 8): A significant difference between the experiments is in the representation of a large feature in the top right of the panels (these panels should have lat/lon labels). If we consider the panels as a 7x7 grid, the feature would be at X=2:3, Y=5. The NADIR run captures this large feature well, whereas the other runs look more diffuse (possibly due to higher variability?). This feature appears to be part of the Gulf Stream mean flow. This suggests that its representation may be part of an improvement in bias, rather than having to do with finer resolution.
L283. Should read “It is also clear …”
L291: Over what region are the PSD scores calculated? Could you define ‘Gulf Stream region’ more precisely as this is used loosely throughout the paper. Does this region include land (e.g. the ‘Gulf Stream region’ shown in Fig. 4 or the one used in Fig. 8). If not and a smaller domain with only ocean points is chosen, how are PSD scores near 10deg obtained as they would have very few cycles.
L303: Ballarotta et al. (2019) aim to assess the effective spatial resolution of their procedure to re-grid altimetry observations. The use of this technique here has a somewhat different connotation. The scale at which the PSD of the error is larger than ½ of the PSD of the observations would be more appropriately referred to as the limit of constrained scales. It is correct to note that Ballarotta et al. refer to this as “effective resolution”, but it should not be interpreted as such here. For example, there are areas for which the data assimilation system is having trouble assimilating the altimetry observations, resulting in larger errors (e.g. in the northeast Pacific Ocean and in the Arctic around 180deg). In the latter, Fig. 10 suggests an “effective resolution” of over 500km! It would be more accurate to refer to this as the limit of constrained scales. In this case, it would highlight that the assimilation system isn't constraining the SSH to any measurable degree in this region of the Arctic.
L312: ‘…at each point’. If there are a series of 10x10deg boxes every 1deg, then the corresponding PSDs and ratios are considered to correspond to the center of the 10x10deg box. Is this what ‘at each point’ refers to? The center of the 10x10deg box? Perhaps reword to make this more clear.
L312: ‘…where the ratio described above was 0.5’. Reword to something like ‘…where the ratio described above is equal to 0.5’ or ‘…crossed the threshold of 0.5’.
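To make the 0.5-threshold diagnostic in the preceding comments concrete, the following sketch shows one assumed form of the along-track computation (the ratio of the SSH-error PSD to the true-SSH PSD; grid spacing, segment length and names are placeholders, and a roughly monotonic ratio is assumed when locating the crossing):

```python
import numpy as np
from scipy.signal import welch

def constrained_scale_km(ssh_err, ssh_true, dx_km=10.0, threshold=0.5):
    """Smallest wavelength (km) at which PSD(error)/PSD(truth) stays below threshold."""
    nper = min(256, ssh_err.size)
    f, p_err = welch(ssh_err, fs=1.0 / dx_km, nperseg=nper)
    _, p_true = welch(ssh_true, fs=1.0 / dx_km, nperseg=nper)
    ratio = p_err[1:] / p_true[1:]   # skip the zero-frequency bin
    wavelength = 1.0 / f[1:]         # cycles/km -> km
    constrained = wavelength[ratio < threshold]
    return constrained.min() if constrained.size else np.nan
```

Under the reading recommended above, the returned value would be reported as the limit of constrained scales rather than as an "effective resolution".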
L313: The anomaly in the northeast Pacific is glaringly obvious in Fig. 10, yet there is no comment. This unexplained issue puts in question the validity of the rest of the study.
L351: For these three experiments, is the only change the introduction of different perturbations to the synthetic data? Is any change made to the data assimilation system to account for correlated errors? Is the observation error variance changed in the assimilation system? Also, it would be helpful to provide details regarding the amplitude of the correlated errors and how they were determined. Moreover, while it's noted that efforts are underway to develop corrections to these correlated errors, it's not clear to what extent some correction is used here. Without any correction, the errors would be extremely high (e.g. greater than 50 cm). While the correlated errors may not be completely removed, a reasonable approximation would be to assume some correction of the correlated errors. So what is used here, and what is this estimate based on?
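For reference, spatially correlated perturbations of the kind at issue here are commonly simulated by smoothing white noise to a target correlation length and rescaling to a target standard deviation. A minimal sketch, with placeholder amplitude and length scale (not values from the paper):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def correlated_noise(shape, sigma_cm, corr_len_km, dx_km, seed=0):
    """White noise smoothed to ~corr_len_km, rescaled to sigma_cm standard deviation."""
    rng = np.random.default_rng(seed)
    field = gaussian_filter(rng.standard_normal(shape), sigma=corr_len_km / dx_km)
    return sigma_cm * field / field.std()

# e.g. a 2 cm, 50 km correlated component on a hypothetical 2 km swath grid:
eta_corr = correlated_noise((500, 60), sigma_cm=2.0, corr_len_km=50.0, dx_km=2.0)
```

Whether the prescribed observation error variance in the assimilation is inflated to allow for such a component is precisely the detail this comment asks the authors to state.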
L400 “…showed an overall…”. Typo.
L404: “Although we were unable to find why this affected only the 2WISA experiment and not the NADIR, we found that the effect was localised to the region of degration in the northern Pacific and did not affect the impact of assimilation in other regions, nor the overall order of the impact or our conclusions”. I find this issue somewhat troubling. If the authors were unable to identify the cause of the problem, how can they be sure it is localized in the north Pacific? As noted above, leaving it to the conclusions to offer an explanation for this issue undermines the reliability of the results. It would be better to note earlier in the text that this issue is present in the results and that efforts have been made to provide an assessment of the impact of the satellite constellations that is not affected by this issue.
L414: The main difference is not necessarily the assimilation window, but rather the observations used in the analysis. The system used in Benkiran et al. (2024) produces daily analyses using observations from days in the past and the future, similar to a Kalman smoother approach. Could observations from days before and following the analysis be included in the FOAM system in a similar manner? It would be helpful to clarify the text on this point.
L420: The issue of correlated errors is an important one, but various aspects of how it is approached here require further clarification. The precise values of errors applied are not described nor is how the assimilation system is modified in consequence.
L448: Craig Donlon appears as a co-author, but his contribution is not indicated.
Citation: https://doi.org/10.5194/egusphere-2024-756-RC1 - AC1: 'Reply on RC1', Robert King, 02 Jul 2024
RC2: 'Comment on egusphere-2024-756', Anonymous Referee #2, 17 May 2024
=== General comment
This paper conducts OSSEs with a 3D-VAR-based eddy-resolving system to compare the impacts of 12 Nadir and 2 WiSA satellites on accuracy. While OSSEs are useful for evaluating various yet-to-be-constructed observation networks, this study is limited to only three experiments: assimilating standard observations, standard observations plus 12 Nadir satellites, and standard observations plus 2 WiSA satellites. To comprehensively determine the most efficient observation networks, more diverse experiments are necessary. Additionally, it would be beneficial to include information on observation coverage and funding considerations for constructing these networks.
Moreover, the manuscript uses colloquial expressions, lacks fundamental details about the data assimilation systems, and does not employ statistical tests. These elements are essential for a scientific paper. Therefore, I conclude that the current paper does not meet the criteria for proceeding to the review process and expect significant revision in the next manuscript.
=== Specific comment
Even in the abstract, there are grammatical errors (e.g., "now able to" in L4 and "greatest" in L10) and unclear abbreviations (e.g., SWOT, SSH, RMS). Throughout the manuscript, the descriptions are written in colloquial expressions (e.g., "we see"). Therefore, it is necessary to revise the entire manuscript to ensure scientific and objective descriptions.
The authors use the expression "data assimilation (DA) constraints model" in this manuscript. However, DA does not make any corrections to the model source code except when it is used for parameter estimation; therefore, this expression is inappropriate.
L32: SWOT data from 2023 became available in early 2024.
L37: The use of "very" and similar expressions should be avoided as they lack objectivity.
This study focuses only on the two SSH observation networks (two WiSA and 12 Nadir satellites) planned by the ESA. However, the observation coverages and funding required to construct these networks are substantially different. Even if the ESA plans are currently limited to these two networks, additional sensitivity experiments are necessary to determine the most efficient observation network. Since OSSEs enable the evaluation of various unconstructed observation networks, it is essential to leverage this advantage.
L46: Toy models and low-resolution models such as Lorenz-96 are used in the nature run.
The "control run" in this manuscript is included in the OSSEs. It would be better to incorporate the control run into the OSSEs and avoid using the term "control run" throughout the manuscript.
L74: Better to add 3D-VAR based before NEMOVAR.
Please specify the major differences between the NEMO models used in the nature run and the OSSEs in the 2nd paragraph of subsection 2.1.
Please add a citation for "the real-time atmospheric analysis produced at ECMWF" in L104-105.
To confirm whether the data assimilation systems are functioning correctly, it is essential to show the prescribed observation error variances and covariances. In this manuscript, however, there are only citations of previous papers and almost no specific information. This also applies to the background errors.
Please specify “since we do not … Sentinel altimeters” in L120-121.
Since observation coverage significantly impacts the analysis accuracy, it is essential to indicate the differences in observation coverage (percentage) among the OSSEs.
Please modify the description in L178-180 so that readers can understand it.
In the third paragraph of subsection 2.4, it is unreasonable to compare the accuracy of the practical operational FOAM system with that of the virtual OSSEs, because these frameworks are completely different. It is unnecessary to compare these results, and it would be better to remove them.
Please specify “incomplete” observation sampling in L214.
“significant” and “significantly” can be used only if the statistical tests are conducted.
In this paper, the objective is to evaluate the impacts of 12 Nadir and 2 WiSA satellites on accuracy. However, most figures, especially those depicting spatial patterns, do not illustrate their differences, which is inconsistent with the stated objective.
Please provide an explanation for why both the 12 Nadir and 2 WiSA experiments result in degraded SSH accuracy around the Antarctic region.
Please modify the descriptions in L231-233.
Adding SSH contours to Figure 4 would enhance clarity by illustrating the positions of fronts and eddies.
Please specify the reasons for the degradation of temperature and salinity accuracies in the 12 Nadir and 2 WiSA experiments.
In Figure 5, it would be beneficial to include the temperature and salinity RMSEs in addition to the improvement ratio. To enhance clarity, consider specifying the use of different scales on the x-axis for each panel or using consistent scales across all panels.
Since geostrophic velocities dominate most of the global ocean, it may not be necessary to present detailed validation results of surface currents in subsection 3.3. It would suffice to only describe that the results of sea surface currents are qualitatively similar to those of SSH.
No label for the color scale in Fig. 7.
Please specify reasons for the differences in spatial patterns between SSH and surface currents (Figs. 3 and 7, respectively).
L267: Please specify reasons why the degradation signals are not distributed uniformly across the entire equatorial regions.
In Figure 8, it is unnecessary to display both the monthly mean errors and RMSEs.
The definition of the power spectral density (PSD) score does not appear to be reasonable. It is unclear whether the PSD is calculated in the spatial or temporal direction, and the rationale behind calculating the ratio between the PSD of the SSH error in the OSSEs and that of the true SSH is unclear. Since this definition is relevant to all descriptions in subsection 3.4, I will read the remaining descriptions in the next round.
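One convention used in recent altimetry OSSE studies, which may be what the authors intend (an assumption, not confirmed by the manuscript), defines the score from wavenumber spectra of SSH:

```latex
% Assumed form of the PSD score; k is the along-track wavenumber, \eta is SSH.
\[
  \mathrm{Score}(k) = 1 - \frac{\mathrm{PSD}\!\left[\eta_{\mathrm{OSSE}} - \eta_{\mathrm{truth}}\right](k)}
                               {\mathrm{PSD}\!\left[\eta_{\mathrm{truth}}\right](k)},
  \qquad \text{with scales deemed constrained where } \mathrm{Score}(k) \ge 0.5.
\]
```

If this is the intended definition, stating it explicitly, together with the direction (spatial or temporal) of the transform, would resolve the comment.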
The reasons for using models with different horizontal resolutions of 0.25° and 1/12° should be specified. If the results from both resolutions are qualitatively the same, it may not be necessary to show the results from the 0.25° resolution.
There are no universally accepted rules for using "/" to denote interchangeable expressions, as seen in "2/7% reduction in the u/v RMSE" in L340.
The observation error variance is likely different among the three experiments (2WISA, 2WISA_CORR_TRIM, 2WISA_CORR), indicating a failure in this study to isolate the impacts of observation error covariances.
Generally, the discussion and conclusion should be delineated separately. Moreover, a conclusion spanning over 2 pages is excessively long. Given that it does not succinctly summarize the results, I will review this section in the next revision.
Citation: https://doi.org/10.5194/egusphere-2024-756-RC2 - AC2: 'Reply on RC2', Robert King, 02 Jul 2024