the Creative Commons Attribution 4.0 License.
Catalogue of Strong Nonlinear Surprises in ocean, sea-ice, and atmospheric variables in CMIP6
Abstract. The Coupled Model Intercomparison Project Phase 6 (CMIP6) archive was analysed for the occurrence of Strong Nonlinear Surprises (SNS) in future climate-change projections. To this end, we built an automated detection algorithm to identify SNS in a reproducible manner. Two different types of SNS were defined: abrupt changes measured over decadal timescales and slower state transitions, too large to be explained by the forcing without invoking strong internal feedbacks in the climate system. Data from 54 models were analysed for five shared socio-economic pathways for ocean, sea ice, and atmospheric variables. The algorithm isolates regions of at least 10⁶ km² and utilizes stringent criteria to select SNS. In total 73 SNS were found, divided into 11 categories of which 4 apply to abrupt change and 7 to state transitions. Of the identified SNS 45 % relate to sea-ice cover, 19 % to ocean currents, 29 % to mixed layer depth, and 7 % to atmospheric systems like the Intertropical Convergence Zone. For each category, probability density functions for time-windows of maximal change indicate SNS occurring earlier and at lower global temperature rise than assessed in previous reviews, in particular the ones associated with winter Arctic sea ice disappearance, northern North Atlantic winter mixed layer collapse and subsequent transition of the Atlantic Meridional Overturning Circulation (AMOC) to a weak state in which the cell associated with North Atlantic Deep Water has vanished. This catalogue emphasizes the possibility of SNS already below 2 °C of global warming, even more than the previous assessments based on CMIP5 data.
Status: open (until 21 Sep 2025)
RC1: 'Comment on egusphere-2025-2039', Anonymous Referee #1, 12 Jul 2025
General comments
This manuscript describes a catalogue of Strong Nonlinear Surprises (SNS) in ocean, sea ice and atmospheric variables in CMIP6. The authors expanded on the methodology of a previous assessment on CMIP5 by Drijfhout et al. (2015) by automating the detection of SNS and including an algorithm to combine grid cells into spatially connected regions with SNS. They have a set of 6 categories of SNS, including abrupt changes and state transitions.
The developed method substantially improves the previous method, specifically by automating and including a spatial algorithm. The algorithm performs very well, and the authors are able to successfully capture large SNS in the data. The results are of great interest and highly valuable to the community. The results lead to new insights and have a high potential to stimulate further research and discussion within the field on abrupt dynamics in the climate system. The manuscript could benefit from a clearer description of the methods, careful framing of the results, and a reorganized and more substantive discussion.
Major comments
1.
Could the authors please clarify in the methods section how exactly the regions are determined. Specific points to consider here are the following.
Thresholding is used to select different regions. How are these initial regions created/selected? Is the percentage threshold based on the very first and last values of the timeseries or over a smoothed timeseries/average over n years? This could make a difference for variables with high variability. For the third region finding approach, what is the reasoning for multiplying the percentage scores? In the third phase, formal criteria are applied to the selected regions. Does this merge regions of the same region-finding method, or does it merge any regions regardless of the region-finding approach? If so, will this lead to “smoothing” out of SNS events? Also, what is the point of having higher thresholds in case the different types of regions are merged? Why does it not work to only use the lowest threshold of T = 85%?
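To illustrate why the choice between endpoint values and n-year averages matters for highly variable fields, here is a minimal Python sketch. It is not the authors' algorithm: the function names, the window length, and the series are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def percent_change_endpoints(series):
    """Relative change between the very first and very last value, in percent."""
    return 100.0 * (series[-1] - series[0]) / abs(series[0])


def percent_change_windowed(series, n_years):
    """Relative change between the means of the first and last n_years, in percent."""
    start = mean(series[:n_years])
    end = mean(series[-n_years:])
    return 100.0 * (end - start) / abs(start)


# Hypothetical noisy annual series (invented numbers): values fluctuate
# around 10, and the final year happens to be a low outlier.
series = [10.0, 12.0, 8.0, 11.0, 9.0, 10.0, 12.0, 8.0, 11.0, 4.0]

# Endpoint-based change is dominated by the single outlier in the last year.
endpoint_change = percent_change_endpoints(series)

# Window-based change averages over 5 years at each end (assumed window).
windowed_change = percent_change_windowed(series, n_years=5)

print(f"endpoints: {endpoint_change:.1f} %, 5-year means: {windowed_change:.1f} %")
```

With these invented numbers, the endpoint-based score is several times larger in magnitude than the window-based one, so a region could pass or fail a percentage threshold depending on which definition is used.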
2.
It is not fully clear what choices the authors made in arriving at the 6 different SNS categories and how to interpret them, i.e. in what ways are they similar or different. What is, for example, the difference in interpretation between categories i and ii? Type ii is towards the end of the time series, and it can therefore be less robustly tested whether the change is persistent. Should this then be interpreted differently from a “real” abrupt change event i? Please give a short explanation of what the authors regard as a state transition/new state (criteria iv to vi).
In addition, the manuscript would benefit from more robust reasoning for the different categories and differentiation between abrupt changes and state transitions. Categories iii to vi are concerned with state transitions instead of abrupt shifts. However, when looking at the detected time series, the SNS often seem abrupt (e.g. sea ice “A” and “a” both change abruptly relative to the timescale of their normal dynamics). What is the motivation for separating these? With regards to the criteria of category iii, can the authors explain why they decided on this criterion instead of using vi with an extra requirement of a minimum surface area? Currently, the results sections are divided into abrupt shifts and state transitions for the same systems. Without a clear reasoning on the difference between the two, perhaps the authors can merge the sections for each physical system instead of having this distinction.
3.
Throughout the sections discussing the SNS results, the authors make statements about the mechanisms or forcings of the identified SNS without discussing how they arrived at this conclusion. Can the authors please substantiate the claims they make on this, whether it is based on analyzing the data of multiple variables at the SNS or on literature. We suggest that claims like “forced by”, “caused by”, “leads to”, “driven by” need to be backed up with either references or a note on what is observed in related variables around the SNS.
An (incomplete) list of points where this was done is shown below, and the manuscript would benefit from a thorough check on the whole results section on whether the claims are substantiated.
- Lines 160-162. What is the source for the fact that sea ice loss is caused by the sea ice-albedo feedback in these simulations?
- Lines 205-208. In what way does it lead to a new climate/what type of climate?
- Line 213: “forced by global warming”
- Line 217: Why is this likely driven by the onset of deep convection? Similar question for the explanation around line 220, does the onset of deep convection show in the data? Is it clear what process causes what?
- Line 230: Can the authors show this mechanism or have a reference? If not, mention that this is a proposed mechanism. What is meant by the last sentence of this paragraph?
- Line 288: “the mixed layer collapse is caused by a polar halocline”
- Line 295. Why is freshening a requisite for mixed layer collapse to occur? For example, in Figure 8 it looks like the mixed layer decreases slightly before the freshening starts.
4.
It would be good if the computation of the CDFs was added to the methods section, instead of only being explained and discussed in the discussion. The results of the global CDFs could then be placed near the end of the results section. This would improve the structure and readability a lot.
Figures 16 and 17 are informative, showing the distributions of global warming at which the SNS occurred. The second panel of Figure 16 shows that there are very few simulations above 6 degrees of warming. The authors currently use a cut-off of 11 degrees, but maybe this should be lowered to 6 degrees. The high temperature region draws a lot of attention while not being informative due to the very high uncertainty. Moreover, the color palette puts a very strong focus on the SSP585 scenario due to the bright color. In Figure 17, the CDFs of all categories are shown. However, some categories contain just one model simulation. This makes the CDF highly uncertain. Maybe only the CDFs with more than e.g. 5 detected SNS could be shown, or those with more simulations could be distinguished by a different line style.
Furthermore, in the introduction (line 62), it is stated that PDFs are used to give the likelihood of maximum change. Can the authors explain, or provide a reference for, why a single simulation can statistically give a likelihood? In Figure 3, the global warming level at the point of maximum change in the SNS is used instead of PDFs. What is the reasoning for not using the same method in both cases? For Figure 3, one could take the global warming level at e.g. the midpoint of the PDF instead.
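The suggestion to show only CDFs for categories with enough detected SNS could be sketched as follows. This is an illustration under stated assumptions, not the manuscript's computation: the category labels, the warming levels, and the helper names are invented, and the minimum count of 5 simply follows the example given above.

```python
def empirical_cdf(values):
    """Return (value, cumulative fraction) pairs for a sample, sorted ascending."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def cdfs_with_enough_events(events_by_category, min_events=5):
    """Build an empirical CDF only for categories with at least min_events SNS."""
    return {
        category: empirical_cdf(levels)
        for category, levels in events_by_category.items()
        if len(levels) >= min_events
    }


# Hypothetical global warming levels (in degrees) at which SNS were detected.
# Both the category names and the numbers are invented for this example.
events_by_category = {
    "category with many events": [1.2, 3.1, 1.8, 4.0, 2.5],
    "category with one event": [2.2],
}

cdfs = cdfs_with_enough_events(events_by_category, min_events=5)

for category, cdf in cdfs.items():
    print(category, cdf)
```

Categories that fall below the minimum count are dropped here; an alternative in the same spirit would be to keep them and flag them for a different line style in the plot.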
5.
The discussion contains important points, but it requires an improved structure. A large part of the discussion section is currently occupied with methodology and new results (how to obtain the CDFs and the global warming levels). The manuscript would benefit from moving this part to the methodology and results sections (as described in major comment 4).
The discussion is currently very brief (when the CDF results are not considered). It would be good if the authors linked back to some of the points they mention in the introduction, like the distinction between abrupt shifts and tipping points, and how their results fit into this. In addition, it would be valuable if some discussion on the individual physical subsystems was added, placing their results in the wider literature context including a discussion on future research directions for specific systems.
Lastly, the main conclusion the authors draw is that the number of SNS events rises until a global warming level of 6 degrees is reached, where it stabilizes, even though few simulations reach such high temperatures. It is a little unclear how to interpret this rise since there are fewer simulations. We would recommend rephrasing this conclusion such that it is better supported by the previous discussion.
Specific comments
The authors clearly describe how they distinguish between the terms “abrupt changes/shifts” and “SNS”. They also include a short discussion about the potential harmful effects of the tipping points concept. The strength of the statements regarding the tipping point controversy does not reflect the content of the paper. We believe it is important to discuss the distinction between tipping points and abrupt shifts, but the way it is currently framed might distract from the goal of the paper.
The authors mention in the introduction that they search for events that are “truly surprising”. What is this exactly according to the authors?
It is generally well-argued in the introduction that the goal is to detect large and abrupt changes in the data. However, the authors also argue they want to limit the total number of detected events. What is the reasoning behind that? Instead of being guided by the quantity of SNS in the data, the goal now seems to be to find only the largest changes instead of all large events. This needs more justification.
In the introduction, the authors reference the use of machine learning (line 52). How did the authors make use of machine learning? In the methods section, there is no reference to a machine learning method.
The paragraph at line 85 lists all scanned variables. Why is only one atmospheric variable used? In the introduction it seems that atmospheric variables are also a focus point (also at line 500), which does not come back strongly in the rest of the manuscript.
Why do the authors combine historical data with SSP scenarios? Is this to gather enough statistics for e.g. the Diptest? If so, please mention this in the methods.
Why is global warming calculated with respect to the average temperature from 1850-1880 instead of the preindustrial temperatures from preindustrial control simulations?
Around line 95 the authors mention they only look at yearly averages (except for mixed-layer depth). Some other variables, like sea ice extent, also depend heavily on the season. Summer and winter sea ice likely disappear at different forcing levels. Could the authors analyze summer and winter sea ice separately? By averaging year-round, abrupt changes in summer sea ice are likely missed.
The category of each detected SNS is often denoted by a letter (either lowercase or uppercase). This is difficult to keep track of. We suggest writing it out instead throughout the whole manuscript since it is difficult to remember every category (e.g. “Abrupt shift in NH sea ice” instead of “category a”).
In figure 3, the abbreviations are not yet explained. The figure shows different locations for abrupt changes versus state transitions (for example, MLD and sea ice). Could this be clarified? Furthermore, the order in Fig 3 does not correspond to the order in which the systems are discussed in the results section. It would help to align this for clarity.
It is not clear what the difference is between section 3.1 and section 3.2. According to the formal criteria they are indeed divided into different categories, but are they really physically different from each other? When looking at the time series in Figure A3, the loss of sea ice is also abrupt, even though they are treated as state transitions instead of abrupt shifts. How big is the overlap between models in sections 3.1 and 3.2?
At line 199-200, the authors mention that the thresholds are reached earlier in CMIP6 with reference to Figure 5. However, this figure does not relate to this statement. It would be good to mention the new temperature range in CMIP6 for this comparison since this is not explicitly mentioned.
Line 258: Over what region is this temperature impact measured? Over the area where the mixed layer collapses?
Line 262: Please add an explicit reference with whom the authors agree.
3.4 and 3.5: In both sections 3.4 and 3.5, changes in the subpolar gyre are discussed. In the first paragraph of 3.4, the authors explain that they do not find any abrupt shifts in SPG convection, but later they do discuss such changes. This is confusing and requires clarification. Furthermore, the comparison with the results of Swingedouw et al. (2021) is framed in 3.4 as if the results do not match, while in 3.5 many of the same models are found to exhibit SNS, only as state transitions instead of abrupt changes (see also major comment 2). Because of the large differences between methods and definitions, we suggest that the authors do not make this statement as strongly. Especially regarding the large area threshold used, smaller scale abrupt shifts (on the order of e.g. the Labrador sea) cannot be found, making a precise comparison nearly impossible. Why did the authors not consider a smaller area threshold for this system, given the scale of the processes relevant for convection?
Line 291: What are these larger regions? How are these obtained?
Line 356: In the text, a comparison is made between this manuscript and other articles with reference to Figure 11. However, this figure does not contain any comparisons; this should be added to the figure.
Line 397: Can a reference be added?
Line 403-404: Why are the transitions associated with model bias? In what sense does the double ITCZ become less pronounced?
In the first paragraph of the discussion, it is mentioned that there is a large increase in number of SNS between this assessment and Drijfhout et al. (2015). How is this statement supported? Both assessments used different methodologies and criteria. Using an automated algorithm could likely have increased the number of detected SNS. It would be interesting to see how many events would be detected if some of the CMIP5 variables were re-analyzed with the new methodology (although this would require substantial work and therefore is not a request to the authors).
Line 475: What is meant by the small bump? It is not clear where in the figure this is visible (there is however a small bump at 0 degrees?).
In the discussion, global warming thresholds are given for each category of SNS. Could this be summarized in a table?
Figure 17: What is meant by “maximally changing”? Why does the temperature of the bottom figure range from 0 to 5, while the upper one ranges from 0 to 17?
The authors mention at the end of the introduction that they will compare their results to the assessment of Terpstra et al. (2025). However, in the discussion they do not compare much of the results apart from stating that the CMIP6 thresholds are lower than in CMIP5. The authors could also make a comparison between frequency/thresholds between this manuscript and Terpstra et al. (2025). Even though both use different scenarios, and indeed one-on-one comparison is not possible, would it be possible to go into a bit more detail in the comparison?
Although not strictly necessary, it would be very interesting to have figures with both the time series and spatial extent (like e.g. figures 4, 5, 6) available for all SNS in an online supplement/repository if it does not require too much effort from the authors.
Technical corrections
Add consistent numbering format (e.g. line 324 “seven” and “3”)
Line 22: “Ref” should be the actual reference
Line 56: Remove extra brackets around citation
Line 68: Sentence not starting with a capital letter.
Lines 88-89: clarify what the difference is between msftyz and msftmz since now they both have the same full name (or state they are the same)
Line 96: maybe mention nominal resolution explicitly of Gaussian N90 grid for non-expert readers.
Line 99: TAS is written upper case here, but with lower case at line 90.
Line 131: “Generally, i and vi are generic criteria”. These are types/categories, not criteria.
Line 132 – 133: missing words in this sentence
Line 173: What does “Its similarity” point to? The abrupt change or abrupt shift in the previous sentence?
Line 188: This only occurs in one model, so remove “typically”
Line 206: The abbreviation of ppt is mentioned here but afterwards it is only used in the figures. Maybe this sentence can be removed.
Line 252: Suggestion: SSS decreasing the surface density → freshening
Line 254-256: Check the grammar of this sentence. What do the authors mean exactly by “in terms of atmospheric cooling”?
Line 266-267: NorESM2-MM and NorESM2-LM are mentioned in these two lines. Shouldn’t these both be NorESM2-MM?
Line 272: remove comma between number and unit.
Line 284: “unlikely whether” is not clear, maybe rephrase this sentence.
Line 291: “looking to” --> “looking at”
Lines 308-313: There is some repetition in mentioned locations of the transitions in different models.
Line 412: “…also work in Nature” --> “…also are present in nature”
Page 4, footnote 1: “that” --> “than”
In figure 12, the regions of SNS are shown for both SST and SSH. Is it correct that for both variables the regions are exactly the same?
Figure 16: Unit of degree Celsius is not displayed correctly in the pdf.
Citation: https://doi.org/10.5194/egusphere-2025-2039-RC1