This work is distributed under the Creative Commons Attribution 4.0 License.
Exploring Non-Gaussian Sea Ice Characteristics via Observing System Simulation Experiments
Abstract. The Arctic is warming faster than the global average, a phenomenon commonly referred to as Arctic amplification. Sea ice has been linked to Arctic amplification and has gathered attention recently due to the decline in summer sea ice extent. Data assimilation (DA) is the act of combining observations with prior forecasts to obtain a more accurate model state. Sea ice poses a unique challenge for DA because sea ice variables have bounded, and therefore non-Gaussian, distributions. This non-Gaussian nature violates the Gaussian assumptions built into DA algorithms. This study configures different observing system simulation experiments (OSSEs) to find the sea ice and snow observation subset whose assimilation produces the most accurate analyses and forecasts. Findings indicate that withholding sea ice concentration observations while assimilating snow depth observations produced the best sea ice and snow forecasts. A simplified DA experiment demonstrated that the DA solution is biased when assimilating sea ice concentration observations. The bias arises because the observation error distribution is a truncated normal distribution while the DA method assumes a normal observation likelihood. Additional OSSEs show that using a non-parametric DA method does not alleviate the non-Gaussian effects of the sea ice concentration observations, and that assimilating sea ice surface temperatures has a positive impact on snow updates. Lastly, it is shown that perturbed sea ice model parameters, used to create additional ensemble spread in the free forecasts, lead to a year-long negative snow volume bias.
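To make the truncated-normal argument concrete, here is a minimal sketch (illustrative only, not the authors' code; all values are assumed) of how clipping observations of a bounded quantity at its upper limit biases a Kalman-type update that assumes an unbounded normal likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)
truth = 0.95                 # true SIC near the upper bound (assumed value)
obs_sd = 0.15 * truth        # 15%-of-truth observation error, as in the paper
prior_sd = 0.05              # assumed forecast (prior) uncertainty

# Synthetic observations: additive Gaussian noise clipped to the physical
# bounds, so the error distribution is asymmetric and its mean falls below
# the truth.
obs = np.clip(truth + obs_sd * rng.standard_normal(100_000), 0.0, 1.0)

# Kalman/EAKF mean update that (incorrectly) assumes a normal likelihood;
# the prior mean is set to the truth so any analysis error comes from the obs.
gain = prior_sd**2 / (prior_sd**2 + obs_sd**2)
analysis = truth + gain * (obs - truth)

print(f"mean observation error: {obs.mean() - truth:+.4f}")   # negative
print(f"mean analysis error:    {analysis.mean() - truth:+.4f}")  # negative
```

Both averages come out negative: the clipped observations are biased low, and the Gaussian update passes a fraction (the gain) of that bias into every analysis, mirroring the biased DA solution described above.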
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
- RC1: 'Comment on egusphere-2023-96', Anonymous Referee #1, 01 May 2023
Review of “Exploring Non-Gaussian Sea Ice Characteristics via Observing System Simulation Experiments” by Christopher Riedel and Jeffrey Anderson
General Comments:
This paper describes a series of idealized data assimilation experiments using two different ensemble data assimilation approaches and simulated observations applied to a sea ice model. The issue of non-Gaussian distributions naturally appears due to the bounded nature of all of the important model variables. This is compounded by the fact that values of SIC near the extremes of 0 and 1, where the non-Gaussian effects are most acute, actually tend to dominate relative to values closer to 0.5. In general, I find the paper to be a valuable contribution to this field by using an idealized experimental setting to address some basic aspects of sea ice data assimilation. However, I also have several general and more specific concerns, as detailed below, which I feel need to be addressed before the final version of the paper is published. One of the more important concerns (detailed below) is that the observing network is highly unrealistic, especially for SIC observations, and I feel this may have a direct impact on the study's conclusion that the assimilation of SIC observations degrades the results.
At least from my perspective, it seems that some fundamental issues regarding non-Gaussian distributions, especially for SIC, should be considered and acknowledged in this study. I expect some of the apparent bias in the SIC scores presented in the paper may be a result of using the ensemble mean when computing the bias. This choice of statistic may not be appropriate when the distribution is heavily skewed, as would be the case for SIC when the truth is close to either of the extremes of 0 or 1 (which it very often is). I wonder if even a data assimilation procedure that perfectly handled the non-Gaussian aspects of the problem would still result in a biased ensemble mean, since the distributions would necessarily be skewed. A discussion on what a “perfect” data assimilation procedure (i.e. something closely approximating Bayes theorem) would produce would be very helpful for the reader. Also, I am interested to see if the ensemble median SIC is less biased than the ensemble mean and may provide a “better” measure than the mean for skewed distributions (though I admit the definition of “better” is not completely clear). This would not require additional experiments to be performed, just recomputing the bias scores by using the ensemble median, instead of the mean.
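The referee's mean-versus-median question is easy to check offline; a minimal sketch (values assumed for illustration) with a truncated normal standing in for a skewed SIC posterior near the upper bound:

```python
import numpy as np

rng = np.random.default_rng(0)
truth, sd = 0.98, 0.05   # assumed truth near the bound and posterior spread

# Truncated-normal "perfect posterior": sample a normal centred on the truth
# and keep only the physically admissible draws in [0, 1].
raw = truth + sd * rng.standard_normal(200_000)
ens = raw[(raw >= 0.0) & (raw <= 1.0)]

print(f"ensemble mean bias:   {ens.mean() - truth:+.4f}")
print(f"ensemble median bias: {np.median(ens) - truth:+.4f}")
```

Both statistics sit below the truth because the distribution is necessarily skewed at the bound, which supports the point that even a perfect Bayesian posterior would show a "biased" ensemble mean; the median is only modestly less biased in this example.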
The previous paragraph concerning apparently biased distributions relates to a more general consideration of whether it makes sense to evaluate the bias, or any other single quantity such as the mean, extracted from the ensemble distribution. More general approaches, such as the continuous ranked probability score (CRPS), attempt to evaluate the accuracy of the entire ensemble distribution rather than a single value obtained from it. Please consider using such an approach for evaluating the resulting ensembles or explain why it hasn't been done.
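For reference, the sample-based CRPS the referee mentions is straightforward to compute from an ensemble; a minimal sketch (function name and setup are illustrative):

```python
import numpy as np

def crps_ensemble(ens, y):
    """CRPS estimated from an ensemble: E|X - y| - 0.5 * E|X - X'|.
    Scores the whole predictive distribution against the verifying value y."""
    ens = np.asarray(ens, dtype=float)
    term1 = np.abs(ens - y).mean()
    term2 = np.abs(ens[:, None] - ens[None, :]).mean()
    return term1 - 0.5 * term2

# Example: a skewed ensemble near the SIC upper bound scored against truth 0.98
ens = np.clip(0.98 + 0.05 * np.random.default_rng(0).standard_normal(80), 0, 1)
print(crps_ensemble(ens, 0.98))
```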
Specific Comments:
Line 65: “…if there is an optimal data assimilation setup…”: This claim seems much too general, since the optimal configuration will likely depend on many details of a particular application of DA to sea ice. One such detail is the observing network, especially as mentioned elsewhere with respect to the unrealistic spatial distribution of SIC observations. Please rephrase this here and elsewhere.
Line 83: “One unique aspect of the EAKF…”: This is misleading, since all variants of the ensemble Kalman filter use flow-dependent background-error covariances, not just the EAKF. Also, other data assimilation approaches, such as ensemble-variational approaches, use flow-dependent background-error covariances. Please rephrase.
Line 91: “…poor representation of model errors…”: Presumably in an OSSE context you can perfectly represent any model errors that you choose to include in the experimental setup. Therefore, if inflation is still needed in such a context, then it must be serving a different purpose than accounting for model error. Please give some explanation for what these purposes are.
Line 133: “…15% of the true values of SIC…”: Is it really 15% of the true values of SIC? Meaning that open water and very low concentration values are perfectly observed (i.e. error standard deviation close to zero)? Or is the standard deviation simply 15% (and not dependent on the true SIC)?
Line 134: “The locations for all synthetic observation types…”: For all observation types? This is very unrealistic for SIC which is well observed almost everywhere by passive microwave satellite observations every day. CryoSat-2 is only used for ice thickness measurements. The retrieval process for obtaining ice thickness from these measurements also depends heavily on having accurate snow depth information, which is not well observed currently by any instruments and therefore the ice thickness measurements can have very high levels of uncertainty. I think that for this study to be relevant, a more realistic observing network must be used for all observation types (and also realistic values for observation uncertainties).
Line 136: The acronyms such as AICEN, VICEN, VSNON, SNWD, etc. are very non-intuitive and difficult to remember. These look like FORTRAN variable names, which are not necessarily appropriate for a scientific paper. Please consider variable names or non-acronym labels that readers will more easily remember and recognize. For example, if a quantity is a function of the thickness category, then this could be represented by a subscript of a variable corresponding to the thickness category index.
Line 226: “…not assimilating SIC observations improves most forecast metrics…”: A more realistic (i.e. much denser) observing network for SIC would likely lead to more improvement when assimilating SIC since it would also lead to less ensemble spread and therefore reduced non-Gaussian effects. Also, as already mentioned, it's not clear if the observation error standard deviation for SIC observations is state dependent and, if so, if this could cause some of the resulting negative bias since low SIC observations will obtain more weight than high SIC observations.
Citation: https://doi.org/10.5194/egusphere-2023-96-RC1
- AC1: 'Reply on RC1', Christopher Riedel, 31 Jul 2023
- RC2: 'Comment on egusphere-2023-96', Anonymous Referee #2, 29 May 2023
General comments: This paper evaluates the effect of assimilating different sea ice variables using different approaches, ranging from post-processing to the ensemble Kalman filter to the rank histogram filter. All experiments are performed in an OSSE framework so that the truth is known and the impact of the observations or the assimilation method can be properly separated and evaluated. This is a novel attempt at a very challenging sea ice DA problem. I think there is potential for this paper to resolve some sea ice DA performance issues and contribute to future prediction systems. Unfortunately, I found many issues with the experiment design, including the particular choice of observation error that might have led to the negative results in assimilating concentration. The discussion about using the EnKF and RHF for assimilation is also quite counter-intuitive. I feel there is a missed opportunity here to make a nice proof-of-concept demonstration of assimilating bounded concentration variables using the more suitable non-Gaussian methods, but I am surprised to find that the RHF actually "degrades" the performance. I would encourage the authors to rethink the experiment results, even giving it another attempt with my suggestions below (if time allows). There are also quite a few minor language issues (grammar errors and unclear expressions) in the manuscript, which I have tried to list here as well. I suggest the authors proofread more carefully in the next revision.
Major Issues:
1. The biggest issue in the experiment design is the choice of "15% error for the true values of SIC", Line 133. Is this the right error model for SIC observations? SIC=0 and SIC=1 are two scenarios that are both observed with high accuracy in real data. Far away from the marginal ice zone, the observations and the model should agree well on SIC, and there is little need for any update. To reflect these scenarios, the ensemble spread in SIC should be near zero, since all the members have similar SIC values at those grid points, and the observation error should also be near zero. However, a 15% error means SIC=0 has zero error while SIC=1 has a maximum error of 0.15. Do you mean an observation of SIC=0.99 can have an uncertainty bar of ±0.15? This is certainly not realistic. In fact, I suspect that this specified observation error is causing all the problems in the subsequent results (how assimilating SIC degrades the forecast and causes a negative bias inside the ice cover).
2. In reality, the model-simulated SIC has most of its uncertainty in the marginal ice zone; the concentration elsewhere should be quite accurate. What is more uncertain is the thickness of the ice and snow, and it is especially difficult to recover snow information from real satellite observations. Altimetry only measures the total freeboard, i.e. how much ice sticks out of the water, which includes both ice thickness and snow depth. Other techniques can infer snow depth information from heat signatures, but those observations have large uncertainties, with many assumptions made in the inference. Therefore a 0.1 m SIT error and a 10% error for snow depth might be too optimistic. Thin ice is also not well represented in the model (given that only category 1 is below 1 m), and I am not sure the uncertainty should be the same for thin (young) and thick ice. I suggest revisiting these uncertainties and checking against typical operational numbers.
3. The definitions of the sea ice model and observed variables are quite confusing. There are "concentration" and "thickness", but later also "area" and "volume". My understanding is that SIC is equal to the integrated AICEN. Since volume = area x thickness, VICEN is not actually independent of AICEN, yet they are listed as separate variables in Table 1. The observation-state relation should be formulated in equations to be clear: how is SIT computed from AICEN (is it an average over the 5 categories?), and it is also unclear how SNWD relates to the model VSNON. At some points the terms "area" and "concentration" are used interchangeably, along with AICE and SIC. I suggest defining the state variables with easier-to-read mathematical notation, such as A_{ice,n} for AICEN and h_{ice,n} for the thickness categories; then for observed quantities you can give equations such as SIC = \sum_{n=1}^{5} A_{ice,n} (one possible formulation is sketched after this item).
Also, the experiment names get difficult to follow after the first three, and you keep repeating what each experiment does in the results section. Can you provide more intuitive names, such as "Control", "EAKF_ConcThick", "Add_SnowD", and "RHF_ConcThick"?
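On the equations requested in this item, one plausible formulation (symbols and averaging convention assumed, not taken from the paper) would be:

```latex
\mathrm{SIC} = \sum_{n=1}^{5} A_{\mathrm{ice},n}, \qquad
V_{\mathrm{ice},n} = A_{\mathrm{ice},n}\, h_{\mathrm{ice},n}, \qquad
\mathrm{SIT} = \frac{\sum_{n=1}^{5} V_{\mathrm{ice},n}}{\mathrm{SIC}}, \qquad
\mathrm{SNWD} = \frac{\sum_{n=1}^{5} V_{\mathrm{snow},n}}{\mathrm{SIC}}
```

Stating the relations this way would also settle whether VICEN is treated as independent of AICEN in the update.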
4. The metrics for forecast verification are the total sea ice area and volume. These are integrated variables, so local positive and negative biases may cancel even before the error diagnostics are computed. Since you are performing an OSSE and can evaluate against the truth member, why not choose a metric more suitable for the full 2D fields? You could even summarize errors separately for different areas (inside the ice cover and near the ice edge).
At some point I was wondering whether the ensemble spread represents the correct amount of forecast error for the ensemble mean versus the truth. A measure of the skill of the ensemble seems to be missing from the current diagnostics. For sea ice, there is also a metric similar to the CRPS, the Spatial Probability Score (Goessling and Jung, 2017), that you could add.
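A minimal sketch of the Spatial Probability Score, assuming the conventional 15% ice-presence threshold (array shapes and names are illustrative):

```python
import numpy as np

def spatial_probability_score(ens_sic, true_sic, cell_area, threshold=0.15):
    """SPS (Goessling and Jung, 2017): area integral of the squared difference
    between the ensemble probability of ice presence and the true indicator.
    ens_sic: (n_members, ny, nx); true_sic and cell_area: (ny, nx)."""
    p_fcst = (ens_sic > threshold).mean(axis=0)    # per-cell ensemble probability
    p_true = (true_sic > threshold).astype(float)  # truth indicator field
    return np.sum(cell_area * (p_fcst - p_true) ** 2)
```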
Minor Issues:
Line 45: "Common sea ice descriptive quantities": Here you introduce concentration and thickness, but in the paper you deal extensively with area, volume, and multi-thickness-category integrals. Can you provide some connection early on?
Line 65: "... if there is an optimal data assimilation setup...": I would not use the word "optimal" here, since it is unclear whether the model or the observations are near the optimal condition to provide information about the true sea ice states.
Line 77: "...follow closely to the optimal settings Zhang et al...": Did you use exactly their settings or did you change anything? Please describe this more clearly. Again, I would refer to the "optimal settings" as "the settings that performed best".
Line 81: "Kalman 1960": This is the reference for the Kalman filter; you should cite an EnKF reference (Burgers, or Evensen, or Carrassi's review, or even the Houtekamer and Zhang review paper).
Line 83: "...accurate estimate...": "best estimate" or "best guess" would be better here.
Line 84: "flow-dependent background-error covariance": You should point out that it is the ensemble that allows the estimation of a flow-dependent covariance, while older methods such as "optimal interpolation" or 3DVar only employ static covariances.
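A short numpy sketch of that point, for concreteness (names assumed): the flow-dependent covariance is simply the sample covariance of the current ensemble perturbations.

```python
import numpy as np

def ensemble_covariance(X):
    """Flow-dependent background-error covariance from an ensemble.
    X: (n_members, n_state), one state vector per member."""
    perts = X - X.mean(axis=0, keepdims=True)   # ensemble perturbations
    return perts.T @ perts / (X.shape[0] - 1)   # sample covariance estimate
```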
Line 86: "While the RHF...", check grammar
Line 87: "however, that application is not applied in this study": Well, if not, then you should not mention it in the main methodology; you can discuss whether this application is worth trying in future studies in the discussion, if that is what you are implying.
Line 88: "horizontal localization": a more accurate term is "covariance localization", then you can comment that you only localize in the horizontal directions.
Line 91: "inflating the prior background fields": it is not the full fields that are directly inflated; rather, the "ensemble perturbations" are inflated.
Line 97: "sea ice thickness, which is represented by the product of sea ice volume and sea ice area": this is wrong...sea ice volume = area x thickness.
Line 99: "five categories with lower bounds of...": Why are these exact numbers chosen? Are you following some classic CICE configuration? I have seen other models use categories that focus more on thinner ice (0.1, 0.3, 0.7, 1.0, then 3.0); your values here seem to lean toward the thicker categories.
Line 100: "provides an unique challenge": "poses a challenge" would be more suitable.
Line 110: "non-Gaussian impacts": could you be more clear?
Line 113: "three different parameters were perturbed that impact...": I would discuss the three parameters in separate sentences and be more clear. Please also list the range of values for the perturbation (one standard deviation around the true value), so that people can try to reproduce the experiments.
Line 114: "ocean-ice drag coefficient": The ice motion is mostly driven by atmospheric surface winds, so why did you perturb the ocean drag coefficient instead? Wouldn't it be more responsive to perturb the air-ice drag coefficient, so that surface wind uncertainties could also be partially included in the ensemble perturbations?
Line 116: "equilibrium state"? The model will experience seasonal cycles with the periodic forcings, so how can it reach an equilibrium? I think you meant to spin up the model to reach the correct energetics, climatology, or seasonal variability.
Line 121: "negatively biased": Are all sea ice and snow variables negatively biased? Even for each thickness category?
Line 130: "sea ice and snow quantities have both ...bounds in their representation": "both" sounds weird here; how can a variable have both single and double bounds?
Line 135: "10-second CryoSat-2": can you provide more details? Did you use 10 seconds' worth of CryoSat-2 data?
Line 136: "AICEN, VICEN, and VSNON": I don't think you have defined these names yet.
Line 141: "update the sea ice area in different categories": sea ice "area" and "concentration" seem to be used interchangeably, which is confusing for readers outside the sea ice community.
Line 142: "allow sea ice conc. and volume to be updated": are the AICEN and VICEN two independent variables? I thought that VICEN = AICEN x hicen, and hicen is given (the five categories), so if you update concentration (AICEN) you already have the VICEN updated. Could you clarify?
Line 145: In eq. 1, hsnon is defined as "category-based snow thickness". Is it given by the same 5 categories as sea ice thickness? It is not in capital letters; does that mean it is not a prognostic variable? How does this work? Does hsnon scale with hicen (the ice thickness categories)?
Line 147: for experiment 2, you are missing another important clarification here. Do all observation types (SIC, SIT and SNWD) update all model state variables (VICEN, AICEN, VSNON)? It may be that by default in DART all cross-variable updates are kept, but in other systems there is often ad-hoc cross-variable localization, for example not allowing SIT and SNWD to change AICEN. So please carefully provide details. Table 1 does not have this information either; you need to state it more explicitly here.
Line 150: "non-parametric data assimilation": this sounds like the new DA method has no parameters to tune, whereas "non-parametric" refers to the underlying distribution function. You should just say "non-Gaussian data assimilation".
Eqn. 2: this is not an equation but pseudo-code. You can write AICEN \leftarrow AICEN / SIC.
Line 160-169: here you list three separate treatments for (1) SIC > 1, (2) SIC within bounds but AICEN < 0, and (3) AICEN > 0 but VICEN = 0. In your code, in what order are (1), (2), and (3) applied, and does the order matter? For example, what happens if one first gets rid of the negatives (2) and then squeezes (1)?
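To see why the order can matter, here is a hypothetical sketch of the three treatments (function and variable names assumed; not the authors' code), applying (2), then (1), then (3):

```python
import numpy as np

def bounds_treatments(aicen, vicen, h_mid):
    """aicen, vicen: per-category concentration and volume arrays;
    h_mid: mid-point thickness of each category."""
    aicen = np.where(aicen < 0.0, 0.0, aicen)   # (2) zero negative concentrations
    sic = aicen.sum()
    if sic > 1.0:                               # (1) squeeze so total SIC <= 1
        aicen = aicen / sic
    repair = (aicen > 0.0) & (vicen <= 0.0)     # (3) rebuild missing volume
    vicen = np.where(repair, aicen * h_mid, vicen)
    return aicen, vicen
```

With this ordering the final total SIC stays in bounds; squeezing first and zeroing negatives afterwards can leave the total above 1 again (e.g. aicen = [1.2, -0.1] squeezes to [1.09, -0.09] and then zeroing the negative gives a total of 1.09), so the order does matter.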
Line 168: "mid-point sea ice thickness": It is not easy to understand that you mean the thickness value for each category (bin) here; if you provided a SIT distribution (a histogram over the 5 categories) before and after the three treatments, it would be much clearer.
Line 175: "Total sea ice area, volume, snow volume": What's their connection to the model or observation variables? Can you provide equations for how you computed these integrated values?
Line 179: "Total sea ice area will allow for better evaluation...": The reason is not clear to me.
Line 182: "Mean absolute bias and mean square error": Are they computed for the spatially-integrated values (total sea ice area, etc.) and then averaged over time? Or are they computed for the model states at each grid point and then averaged over both space and time? Again, equations would help a lot here for clarity.
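For instance, if the scores are computed on the time series of the integrated diagnostics, the definitions might read (a guess at the intended convention, with $x_t$ the experiment value and $x_t^{\mathrm{true}}$ the truth at time $t$):

```latex
\mathrm{MAB} = \frac{1}{T}\sum_{t=1}^{T}\bigl|x_t - x_t^{\mathrm{true}}\bigr|,
\qquad
\mathrm{MSE} = \frac{1}{T}\sum_{t=1}^{T}\bigl(x_t - x_t^{\mathrm{true}}\bigr)^{2}
```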
Line 190: "over-prediction, under-prediction": Could you define these please?
Line 198: "In experiments 1-3 investigate...": check grammar
Line 208: "the biases are reduced for total snow volume throughout": not quite reduced for the winter months before June; what happened there?
Line 214: "non-Gaussian component... could be the driving factor": from the RHF results I do not think the degradation in experiments 1 and 2 is due to non-Gaussianity. If it were, the RHF should have improved the forecasts, but in fact it degrades them even more. You should look for further reasons, such as the ensemble spread compounded with the value bound, or strong seasonal cycles creating a large bias in the prior mean.
Line 220: "removing the SIC observations provided a more accurate forecast": This might also suggest that the DA method is not causing the degradation, but rather the data themselves.
Figure 4: It seems most of the MSE is due to bias? MAB is almost equal to MSE. So maybe showing MAB is enough to tell the story.
Line 234: "total sea ice area ... performed poorly": by your definition the SIC integral will include the negative biases, but if, as in operations, people use the 15% concentration threshold to count sea ice area (or extent), then this negative bias will not be a problem.
Line 244: "analysis increment (AI)": I would avoid using an acronym that is already well-established elsewhere.
Line 273: "drift away from true value towards the observed value": it looks like the filter is putting too much weight on the observation in this case. Did the truncation or the bound at 1 cause some error in the representation of the observation uncertainty?
Line 286: "Applying a non-Gaussian ... can lead to erroneous observation impacts": If this is the reason for the low-biased analysis here, then applying the RHF should reduce the bias. But you see the RHF actually increases the low bias in experiment 4... This is contradictory.
Line 293: "near the sea ice margin": the area is called "marginal ice zone (MIZ)"
Line 298 and 329 and 336: "The use of non-parametric RHF did not handle the SIC obs better": again, the issue is not in handling the non-Gaussian distribution; the observation error model is the culprit here. The true Bayesian solution might be a low-biased SIC (given the observations), and the RHF is actually doing a better job of fitting the observations, hence the "worse" low bias.
Line 353: "forcing file for the truth member is an outlier": This is related to how you generated the perturbed members; if the forcing spread is large enough that the ensemble spread covers the truth member, then I would expect the filter to still work, given that the forcing has a near-linear relation with the sea ice and snow states.
Citation: https://doi.org/10.5194/egusphere-2023-96-RC2
- AC2: 'Reply on RC2', Christopher Riedel, 31 Jul 2023