Exploring Non-Gaussian Sea Ice Characteristics via Observing System Simulation Experiments
Abstract. The Arctic is warming faster than the globe on average, a phenomenon commonly referred to as Arctic amplification. Sea ice has been linked to Arctic amplification and has gathered attention recently due to the decline in summer sea ice extent. Data assimilation (DA) combines observations with prior forecasts to obtain a more accurate estimate of the model state. Sea ice poses a unique challenge for DA because sea ice variables have bounded, and therefore non-Gaussian, distributions, which violate the Gaussian assumptions built into DA algorithms. This study configures different observing system simulation experiments (OSSEs) to find the optimal subset of sea ice and snow observations to assimilate in order to produce the most accurate analyses and forecasts. Findings indicate that withholding sea ice concentration observations while assimilating snow depth observations produced the best sea ice and snow forecasts. A simplified DA experiment demonstrated that the DA solution is biased when assimilating sea ice concentration observations. The bias arises because the observation error distribution is a truncated normal, while the DA method assumes a normal observation likelihood. Additional OSSEs show that using a non-parametric DA method does not alleviate the non-Gaussian effects of the sea ice concentration observations, and that assimilating sea ice surface temperatures has a positive impact on snow updates. Lastly, it is shown that the perturbed sea ice model parameters used to create additional ensemble spread in the free forecasts lead to a year-long negative snow volume bias.
Christopher Riedel and Jeffrey Anderson
Status: open (until 11 Jun 2023)
- RC1: 'Comment on egusphere-2023-96', Anonymous Referee #1, 01 May 2023
- RC2: 'Comment on egusphere-2023-96', Anonymous Referee #2, 29 May 2023
Review of “Exploring Non-Gaussian Sea Ice Characteristics via Observing System Simulation Experiments” by Christopher Riedel and Jeffrey Anderson
This paper describes a series of idealized data assimilation experiments using two different ensemble data assimilation approaches and simulated observations applied to a sea ice model. The issue of non-Gaussian distributions naturally appears due to the bounded nature of all of the important model variables. This is compounded by the fact that SIC values near the extremes of 0 and 1, where the non-Gaussian effects are most acute, actually tend to dominate relative to values closer to 0.5. In general, I find the paper to be a valuable contribution to this field by using an idealized experimental setting to address some basic aspects of sea ice data assimilation. However, I also have several general and more specific concerns, as detailed below, which I feel need to be addressed before the final version of the paper is published. One of the more important concerns (detailed below) is that the observing network is highly unrealistic, especially for SIC observations, and I feel this may have a direct impact on the study’s conclusion that assimilating SIC observations degrades the results.
At least from my perspective, it seems that some fundamental issues regarding non-Gaussian distributions, especially for SIC, should be considered and acknowledged in this study. I expect some of the apparent bias in the SIC scores presented in the paper may be a result of using the ensemble mean when computing the bias. This choice of statistic may not be appropriate when the distribution is heavily skewed, as would be the case for SIC when the truth is close to either of the extremes of 0 or 1 (which it very often is). I wonder if even a data assimilation procedure that perfectly handled the non-Gaussian aspects of the problem would still result in a biased ensemble mean, since the distributions would necessarily be skewed. A discussion on what a “perfect” data assimilation procedure (i.e. something closely approximating Bayes theorem) would produce would be very helpful for the reader. Also, I am interested to see if the ensemble median SIC is less biased than the ensemble mean and may provide a “better” measure than the mean for skewed distributions (though I admit the definition of “better” is not completely clear). This would not require additional experiments, just recomputing the bias scores using the ensemble median instead of the mean.
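The skewness effect described above can be checked with a short, self-contained sketch (my own illustration, not the paper's code): for a truth near the upper bound of 1, an ensemble following a truncated normal has both its mean and median below the truth, with the mean shifted further.

```python
# Hypothetical illustration: ensemble-mean vs. ensemble-median bias for a
# bounded quantity (SIC) whose truth sits close to the upper bound.
import random
import statistics

random.seed(0)
truth = 0.98          # true SIC, near the physical bound of 1
sigma = 0.10          # assumed (untruncated) ensemble spread

# Rejection-sample a truncated normal on [0, 1]
ens = []
while len(ens) < 100_000:
    x = random.gauss(truth, sigma)
    if 0.0 <= x <= 1.0:
        ens.append(x)

mean_bias = statistics.fmean(ens) - truth      # negative
median_bias = statistics.median(ens) - truth   # negative, smaller magnitude
print(f"mean bias:   {mean_bias:+.3f}")
print(f"median bias: {median_bias:+.3f}")
```

Both statistics are pulled below the truth by the truncation, but the median is less affected, consistent with the referee's suggestion that it may be a less biased summary for skewed distributions.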
The previous paragraph concerning apparently biased distributions relates to a more general question: whether it makes sense to evaluate a bias, or any other score of a single quantity, such as the mean, extracted from the ensemble distribution. More general approaches, such as the continuous ranked probability score (CRPS), attempt to evaluate the accuracy of the entire ensemble distribution rather than a single value obtained from it. Please consider using such an approach for evaluating the resulting ensembles, or explain why it has not been done.
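As a concrete sketch of the suggested score (function name and values are my own, not from the paper), the CRPS for a finite ensemble and a scalar verifying value can be computed from its kernel form, CRPS = E|X − y| − ½ E|X − X′|:

```python
def crps_ensemble(members, obs):
    """Kernel form of the CRPS for a finite ensemble and a scalar obs."""
    n = len(members)
    spread_to_obs = sum(abs(m - obs) for m in members) / n
    internal_spread = sum(abs(a - b) for a in members for b in members) / (2 * n * n)
    return spread_to_obs - internal_spread

# A sharp ensemble centred on the verifying value scores lower (better)
# than a diffuse one.
print(crps_ensemble([0.4, 0.5, 0.6], 0.5))  # small
print(crps_ensemble([0.1, 0.5, 0.9], 0.5))  # larger
```

For a single-member "ensemble" this score reduces to the absolute error, which is one reason it is attractive for comparing deterministic and probabilistic verification on the same footing.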
Line 65: “…if there is an optimal data assimilation setup…”: This claim seems much too general, since the optimal configuration will likely depend on many details of a particular application of DA to sea ice. One such detail is the observing network, especially as mentioned elsewhere with respect to the unrealistic spatial distribution of SIC observations. Please rephrase this here and elsewhere.
Line 83: “One unique aspect of the EAKF…”: This is misleading, since all variants of the ensemble Kalman filter use flow-dependent background-error covariances, not just the EAKF. Also, other data assimilation approaches, such as ensemble-variational approaches, use flow-dependent background-error covariances. Please rephrase.
Line 91: “…poor representation of model errors…”: Presumably in an OSSE context you can perfectly represent any model errors that you choose to include in the experimental setup. Therefore, if inflation is still needed in such a context, then it must be serving a different purpose than accounting for model error. Please give some explanation for what these purposes are.
Line 133: “…15% of the true values of SIC…”: Is it really 15% of the true values of SIC? That would mean open water and very low concentration values are observed nearly perfectly (i.e. error standard deviation close to zero). Or is the standard deviation simply 15% (and not dependent on the true SIC)?
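To make the two readings concrete, here is a hypothetical sketch (not the authors' code) of the observation generators they imply: a multiplicative error standard deviation of 0.15 × true SIC versus a fixed additive standard deviation of 0.15, with the synthetic observation clipped to the physical bounds:

```python
import random

random.seed(1)

def synthetic_sic_obs(truth, multiplicative=True):
    """Draw one synthetic SIC observation, clipped to [0, 1]."""
    sd = 0.15 * truth if multiplicative else 0.15
    return min(1.0, max(0.0, random.gauss(truth, sd)))

# Under the multiplicative reading, open water (truth = 0) is observed
# perfectly, since the error standard deviation collapses to zero.
print(synthetic_sic_obs(0.0))         # exactly 0.0
print(synthetic_sic_obs(0.0, False))  # noisy; clipped at 0 if the draw is negative
```

The two readings imply very different observation weights near the ice edge, which is why the ambiguity matters for interpreting the results.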
Line 134: “The locations for all synthetic observation types…”: For all observation types? This is very unrealistic for SIC which is well observed almost everywhere by passive microwave satellite observations every day. CryoSat-2 is only used for ice thickness measurements. The retrieval process for obtaining ice thickness from these measurements also depends heavily on having accurate snow depth information, which is not well observed currently by any instruments and therefore the ice thickness measurements can have very high levels of uncertainty. I think that for this study to be relevant, a more realistic observing network must be used for all observation types (and also realistic values for observation uncertainties).
Line 136: The acronyms such as AICEN, VICEN, VSNON, SNWD, etc. are very non-intuitive and difficult to remember. These look like FORTRAN variable names, which are not necessarily appropriate for a scientific paper. Please consider variable names or non-acronym labels that readers will more easily remember and recognize. For example, if a quantity is a function of the thickness category, then this could be represented by a subscript of a variable corresponding to the thickness category index.
Line 226: “…not assimilating SIC observations improves most forecast metrics…”: A more realistic (i.e. much denser) observing network for SIC would likely lead to more improvement when assimilating SIC since it would also lead to less ensemble spread and therefore reduced non-Gaussian effects. Also, as already mentioned, it's not clear if the observation error standard deviation for SIC observations is state dependent and, if so, if this could cause some of the resulting negative bias since low SIC observations will obtain more weight than high SIC observations.
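The weighting asymmetry raised above can be illustrated with a scalar Kalman gain K = P/(P + R) (a hypothetical sketch, not the study's configuration): if the observation error standard deviation is 15% of the true SIC, then R is far smaller for low-SIC observations, which therefore pull the analysis much harder than high-SIC observations do.

```python
def scalar_kalman_gain(prior_var, obs_sd):
    """K = P / (P + R) for a scalar state observed directly."""
    return prior_var / (prior_var + obs_sd ** 2)

prior_var = 0.1 ** 2                                 # assumed ensemble variance
k_low = scalar_kalman_gain(prior_var, 0.15 * 0.1)    # obs where true SIC = 0.1
k_high = scalar_kalman_gain(prior_var, 0.15 * 0.9)   # obs where true SIC = 0.9
print(f"gain at low SIC:  {k_low:.2f}")   # near 1: observation dominates
print(f"gain at high SIC: {k_high:.2f}")  # much smaller
```

If the observation error is state dependent in this way, the analysis is systematically drawn toward low-SIC observations, which is one plausible mechanism for the negative bias the referee suspects.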