This work is distributed under the Creative Commons Attribution 4.0 License.
Evaluating Arctic Sea-Ice and Snow Thickness: A Proxy-Based Comparison of MOSAiC Data with CMIP6 Simulations
Abstract. Arctic sea-ice cover and thickness have declined rapidly in the recent past. Snow cover on sea ice acts as an insulating barrier and has been shown to be instrumental in driving the variability and trends in sea-ice thickness. Because of this, the ability of climate models to realistically simulate the present-day annual cycles of Arctic sea-ice properties has become a central measure of model performance in Arctic-focused climate model intercomparisons. However, evaluating free-running model simulations usually requires multi-year observational datasets, whereas existing Arctic measurements, particularly of sea-ice and snow thickness, are comparatively short. In this exploratory study, we propose a new methodology to make comparisons of sea-ice and snow observations with model data more meaningful. We make use of the exceptional year-long MOSAiC observations to examine the simulated Arctic sea-ice and snow thickness in 10 CMIP6 models. To enable meaningful comparisons with the model simulations, we define two “proxy-year” selection methods, based on sea-ice-area and atmospheric criteria, that identify model years in which Arctic conditions are similar to those during the MOSAiC year. We verify the capability of the proxy-year composites to capture the atmospheric and sea-ice variability by comparing them with sets of nudged simulations in which the atmospheric circulation observed during the MOSAiC year is directly imposed. Our results show that the models tend to simulate annual cycles similar to the observed ones, however with an overestimated amplitude for snow thickness and a misaligned phase of the sea-ice-thickness cycle. Overall, the study highlights that, regardless of the specific model configurations and the conditions within individual proxy years, biases in sea-ice and snow thickness remain consistent, even when wind conditions are imposed in the nudged model simulations. This underlines the need for a better representation of the modeled processes driving sea-ice and snow thickness, which will be instrumental in the next generation of GCMs. This first MOSAiC-based assessment of modeled snow and ice thickness, and the proposed proxy-year methodology, pave the way for further meaningful model evaluation.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2214', Anonymous Referee #1, 02 Oct 2024
This study attempts to evaluate the accuracy of CMIP6 climate models in simulating Arctic sea-ice and snow thickness, comparing their results to year-long MOSAiC observations. It introduces the concept of "proxy years" based on sea-ice area and atmospheric conditions to align model data with the single-year MOSAiC observations (Snow Buoys/IMBs).
I think the general concept of how we can use a single year of observations to evaluate climate models is worth exploring. But I struggled with the methodology and am not at all confident that the conclusions about model “biases” are valid. There are large scaling issues and significant natural climate variability at play that I don’t think have been appropriately accounted for in this comparison. The main objective is to “examine whether discrepancies in the proxy-based ice and snow cycles arise from mismatching weather conditions or from insufficient process representations”, but I am not convinced you have demonstrated that, and I don’t think we can use this analysis to really understand CMIP6 model biases.
I think the nudged simulations offer the most enticing part of the current manuscript, so I would encourage the authors to focus more of the effort and write-up on that.
General Comments:
L273: “averaging over a large number of observations contributes towards making it comparable.”
This is quite a simplistic way of dealing with the considerable scaling issues. It’s not just that the data represent different resolutions; it’s also a question of how closely a sea-ice model should be expected to match a relatively sparse and localized series of observations like the ones collected from MOSAiC, in either the proxy analysis or the nudged-simulation analysis.
L293: “For an accurate and comprehensive selection of proxy years with characteristics similar to the MOSAiC year, it is crucial to eliminate divergences from observations which arise from the free-running models’ different realizations of natural variability. Therefore, our method refines the selection process by excluding conditions (or years) vastly different from those observed during MOSAiC, ensuring the chosen years mirror the sea-ice and atmospheric conditions of the study period.”
I do not consider this approach to be the right one. I think it would be more helpful to really assess how the MOSAiC year compared with the full model ensemble spread (how typical was it, etc.), and to provide more insight into the probability of a model sea-ice state agreeing with the sea-ice observations given the atmospheric variability across the model ensemble.
The nudged simulations, I think, are maybe the best idea here, but even with those you don’t really compare before and after nudging, so it’s hard to know how significant the nudging is compared to other factors. ECHAM6/FESOM contributed to CMIP6, but I don’t think this is sufficient for it to be described as the non-nudged version? And you don’t show that anyway.
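[Editorial note: to make the reviewer's suggestion concrete, here is a minimal sketch, with synthetic stand-in data rather than the authors' actual ensemble, of how one could rank the observed MOSAiC-year state within the pooled free-running ensemble distribution.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: monthly-mean SIA (1e6 km^2) for 30 members x 36 years,
# plus one observed annual cycle; real ensemble and MOSAiC data would replace both.
ens_sia = rng.normal(loc=10.0, scale=1.0, size=(30, 36, 12))
obs_sia = rng.normal(loc=10.0, scale=1.0, size=12)

# Percentile rank of the observation within the pooled ensemble, month by month;
# ranks near 0 or 1 flag months where the observed state is atypical relative
# to the model's simulated internal variability.
pooled = ens_sia.reshape(-1, 12)
rank = (pooled < obs_sia).mean(axis=0)
print(np.round(rank, 2))
```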
There are a lot of subjective statements in the methods about the benefit of the approach, I think it is best to stick to the methods description and let the reader decide on how suitable they are!
The paper, especially the introduction, needs help with the writing: there are many language issues, a lack of relevant citations, and not much in the way of a literature review.
Specific comments:
L212: “The SIT values used throughout the study for all the climate models, are weighted by the “siconc”
I think you need to check this, as the ice thickness should be the ice thickness where you have ice and shouldn’t need to be weighted by ice concentration; you are maybe mixing that up with the ice volume?
L270: “Since GCMs provide a single mean value of SIT and snow thickness for each grid cell…”
Actually, some CMIP6 models and model runs, such as CESM, do include the full ITD output!
And again, just averaging everything doesn’t ensure the quantities are comparable.
Citation: https://doi.org/10.5194/egusphere-2024-2214-RC1
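[Editorial note: a minimal xarray sketch of the thickness/volume distinction RC1 raises above; the file names are hypothetical, while "sithick" and "siconc" follow standard CMIP6/SIMIP variable conventions.]

```python
import xarray as xr

# Hypothetical file names; "sithick" and "siconc" are standard CMIP6 variables.
sithick = xr.open_dataset("sithick_SImon_MODEL_historical_r1i1p1f1_gn.nc")["sithick"]
siconc = xr.open_dataset("siconc_SImon_MODEL_historical_r1i1p1f1_gn.nc")["siconc"] / 100.0

# sithick is the mean thickness of the ice-covered fraction of each cell (m).
# Weighting it by concentration yields the grid-cell-mean (volume-equivalent)
# thickness -- the quantity behind "sivol" -- not the thickness measured on the ice.
sivol_equiv = sithick * siconc
```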
RC2: 'Comment on egusphere-2024-2214', Anonymous Referee #2, 08 Nov 2024
This article presents a method for model–observation comparisons using a proxy-year method, in the context of comparing simulated sea-ice and snow thickness to in situ observations from the MOSAiC field campaign. Two criteria for the proxy-year selection are considered: selecting years based on agreement in sea-ice area, and in atmospheric conditions (in particular, the strength of the Arctic Oscillation). The snow depth and ice thickness in proxy years are compared to model simulations nudged to atmospheric conditions, and also to a bootstrap (Monte Carlo) distribution of randomly selected years to examine the performance of proxy years relative to randomized years. Proxy years selected using the sea-ice-area criterion perform comparably to nudged model simulations for sea-ice thickness and agree better with observations than randomly selected values. However, regardless of nudging or proxy-year selection, the models have difficulty representing the evolution of observed snow thickness.
I think the development of the proxy-year method is reasonably motivated; I agree with the authors that it would be helpful to have less resource-intensive approaches (relative to running nudged simulations) available. However, I think this article needs a number of major revisions and clarifications for the proxy-year method to be more replicable and for more clarity on the applicability of the method and, in particular, on its limitations. I will detail my general and minor comments below.
General comments
Although the article tests two proxy selection criteria (SIA and AO) against two observed quantities (SIT and snow thickness), from what I see in the results (Fig 8), only the SIA proxy performs better than random selection, and only for SIT. The AO proxy, conversely, does not perform better than random selection, and snow thickness is not well estimated overall. The authors do state this in the results, but I think there are several instances in the discussion and conclusions where it is not stated clearly enough. Overall, more care needs to be taken throughout the article to explicitly state which approaches are applicable in which situations, to clearly state where methods and models did not perform well, and to explicitly state which metrics are being used to evaluate performance. I think the article would also benefit from more emphasis on quantitative assessment of the performance of the method (e.g. quantifying the agreement with observations more explicitly).
I would appreciate more background motivation on the choice of the proxy year method as a method for this analysis. To my recollection, similar methods have been used in paleoclimate research; including some references there would be helpful, or some references to similar methods to the proxy year method, if possible.
Since this is an article proposing a method presumably for use in other research, I think it is crucial that the method be documented in sufficient detail for it to be reproduced. In its current state, the article is somewhat lacking in the detail necessary for reproducibility and would benefit from some clarifying revisions. For example, for the SIA proxy, estimates of contributions from first-year and multi-year ice are discussed, but it is unclear to me how the total seasonal difference in Fig 2b was calculated from these contributions. Is this the sum of the two differences in Fig 2a? Also, was the absolute difference used, or some other difference metric? Some additional details would be helpful here. Likewise, I would appreciate similar clarification for the AO proxy. In general, it is also not always clear to me, when the authors discuss result significance and inter-model differences, how these differences are being determined and compared.
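[Editorial note: for illustration, one plausible reading of the seasonal-difference metric the reviewer asks about. This is an assumption, not a confirmed reconstruction of the authors' formula.]

```python
import numpy as np

def seasonal_difference(model_fyi, model_myi, obs_fyi, obs_myi):
    """Sum over months of the absolute first-year-ice and multi-year-ice SIA
    differences between a candidate model year and the MOSAiC year.
    All inputs: length-12 arrays of monthly-mean SIA (e.g. 1e6 km^2)."""
    return (np.abs(np.asarray(model_fyi) - np.asarray(obs_fyi)).sum()
            + np.abs(np.asarray(model_myi) - np.asarray(obs_myi)).sum())
```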
This article emphasizes that the proxy-year method addresses three challenges for model-observation comparisons; spatial sampling differences, measurement locations shifting in time, and the fact that GCMs are not designed to simulate specific years. However, in my opinion, the approaches used to address the first two challenges are not novel. As such, I do not think there needs to be as much emphasis placed in the article on averaging a large number of observations or collocating measurements to model grid cells, since both of these are commonly-used approaches for model-observation comparisons. I do think the use of these approaches is worth mentioning, but I do not think they need as much emphasis as the use of the proxy-year method.
I understand that models represent a range of possibilities. However, I wonder about the representativeness of historical years being chosen as proxies for the MOSAiC observation year, which took place after the simulated historical period. Given that there have been recent discussions of regime shifts in Arctic sea ice (e.g. Sumata et al., 2023), I wonder about the applicability of proxy years from earlier decades to more recent years, and what biases could potentially be introduced by interdecadal differences. I would appreciate it if the authors would comment on this and on other possible confounding factors. Given the caveats associated with this method, I think the authors need to be more careful when making conclusions about overall model process representativeness based on proxy years.
Finally, I find the flow and organization of the article to be confusing at times, and some parts of the discussion appear to be outside of the intended scope of the article. Some points are also raised suddenly without being mentioned earlier in the article. For example, methodology validation is not mentioned until Section 3.4, when the Monte Carlo method is suddenly introduced. I would find a mention of this in the introduction to be helpful. The last paragraph of the introduction could potentially be expanded with slightly more detail about the contents of the sections. I will include some mentions of where detail could be added in the comments below. I would encourage the authors to consider if parts of the article could also be condensed or restructured for clarity of scope as well.
Minor comments:
161: You introduce the nudging here but not why it is used; I suggest briefly mentioning the purpose of the nudging here, because otherwise it seems unrelated to the proxy year method, and “another set of comparisons” is vague phrasing (comparisons to what?)
208: I’m guessing the period 1979-2014 was chosen because the historical runs are limited to up to 2014, but maybe specify that in the text. Why were the historical runs chosen over other years? Could you comment on the potential utility of other scenarios? My comment below on line 220 is related to this also.
220-223: I have some questions as to how this decision was made, since it does seem like using the SSP585 scenario may be adding biased results. Which observed sea ice characteristics were not reached? Some clarification would be helpful.
249: Previous nudging studies are cited, but I think it would be helpful for clarity to explicitly state in the article that the nudging methods used here are motivated by previous studies.
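[Editorial note: for readers unfamiliar with the technique, a one-line sketch of Newtonian relaxation ("nudging") toward a reference wind field; the relaxation timescale tau is a free parameter, with values of a few hours to a day being typical in the cited literature.]

```python
def nudge(u_model, u_ref, dt, tau=6 * 3600.0):
    """One nudged time step: relax the model wind toward the reference
    (e.g. reanalysis) wind with timescale tau; dt and tau in seconds."""
    return u_model + dt * (u_ref - u_model) / tau
```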
Figure 2: The colourbar in b) denotes the scale for the selected years, but I would also be curious to know the scale for the greyed-out values. Consider possibly including a colourbar for the greyscale values also, or highlighting the selected years in another way. In the caption, “highlighted” is also somewhat ambiguous, perhaps replace this phrasing with “in colour”.
Figure 7: I realize you want to show daily variations here, but it would be nice to have monthly means of these plots available to be able to more directly compare with Fig 5 and 6.
545: Would it be possible to show the monthly accumulated precipitation/snowfall values in a supplemental figure? I am curious to see how closely they agree, given that precipitation is challenging to represent in models. This would also help illustrate how large the 0.005 m quantity is relative to the amount of precipitation.
561: I think more detail is needed on how the Monte Carlo samples were generated and selected. E.g. were all the models pooled together for these 10,000 samples?
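[Editorial note: to make the question concrete, a minimal sketch of one way the 10,000 bootstrap composites could be generated. The data are synthetic and the composite size is assumed; whether years are drawn per model or pooled across models is exactly what the reviewer asks to have clarified.]

```python
import numpy as np

rng = np.random.default_rng(42)
n_years, n_proxy, n_samples = 36, 5, 10_000  # pool size and composite size assumed

# Synthetic stand-in: monthly-mean SIT for one model, shape (n_years, 12).
annual_cycles = rng.normal(loc=1.5, scale=0.3, size=(n_years, 12))

boot = np.empty((n_samples, 12))
for i in range(n_samples):
    pick = rng.choice(n_years, size=n_proxy, replace=False)
    boot[i] = annual_cycles[pick].mean(axis=0)

# The 2.5th-97.5th percentile envelope of random composites is the baseline
# against which the proxy-year composites can be judged.
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
```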
574-575: It may just be the phrasing confusing me here, but when you say the method “captures real-world scenarios, albeit rare ones”, does that imply that more typical scenarios may be more difficult to capture?
605-607: “Both methods account for the observed spatio-temporal variables […] ensuring a closer approximation to the sea-ice and atmospheric conditions during the study period.” I think this sentence may need to be rephrased, since in this article, you show that only the SIA criterion improves agreement with sea ice, with the AO criteria lying close to the bootstrap mean (as stated in lines 565-566)
610-612: “Our two proxy year selection methods demonstrate performance comparable to that of atmospherically nudged simulations”; I think you need to be careful here because the AO method does not appear to be as comparable to nudged simulations as the SIA method. By which metric are you defining “comparable” performance here?
624: Given that snowfall is brought up as a point of discussion here, I think it would be helpful to include it in a supplemental figure.
735: Annual maximum and variations of which specific quantities? Please clarify.
742: I think you should clarify here that you saw modest enhancements in aligning SIT with observed annual cycles, unless you’re referring to other quantities as well, in which case you should specify for clarity.
Minor edits by line
159: “a comparison of MOSAiC dataset” should be “the MOSAiC dataset”
220: “ensemble.” -> “ensemble member.”
432: less -> fewer
435: fewer: consider instead “lower” or “lesser”
543: Fig 5b refers to SIT, did you mean Figures 6a and 6b for snow thickness?
748: hereinabove -> above
References:
Sumata, H., de Steur, L., Divine, D.V. et al. Regime shift in Arctic Ocean sea ice thickness. Nature 615, 443–449 (2023). https://doi.org/10.1038/s41586-022-05686-x
Citation: https://doi.org/10.5194/egusphere-2024-2214-RC2