the Creative Commons Attribution 4.0 License.
Inconclusive Early warning signals for Dansgaard-Oeschger events across Greenland ice cores
Abstract. The Dansgaard-Oeschger (DO) events of past glacial episodes provide an archetypical example of abrupt climate shifts and are discernible, for example, in oxygen isotope ratios from Greenland ice core records. The physical causes and mechanisms underlying these events are still subjects of ongoing debate. It has previously been hypothesised that DO events may be triggered by bifurcations of physical mechanisms operating at decadal time scales, as indicated by a significant number of early warning signals (EWS) in the high-frequency variability of records from the North Greenland Ice Core Project (NGRIP). Here, we re-evaluate the presence of EWS by employing indicators based on critical slowing down (CSD) and wavelet analysis and conduct a systematic methodological robustness test. Our findings reveal fewer significant EWS than previous studies, yet their numbers are significant for some of the indicators estimating changes in variability. Additionally, a comparison of different Greenland ice core records also shows significant numbers and consistency for these same EWS estimators preceding a small selection of events in records with high temporal resolution. While those indicators might represent a common climate background, we cannot rule out that signals specific to the different ice core locations are captured. Estimators of correlation times were found to be less consistent and did not provide significant numbers of EWS when considered on their own. Based on these inconclusive results it is not possible to constrain the physical mechanisms underlying the DO events. Instead, our results highlight the complexities and limitations of applying early warning signals to paleoclimate proxy data.
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2024-3567', Peter Ditlevsen, 26 Jan 2025
This paper presents a thorough analysis of Early Warning Signals (EWS) prior to the abrupt Dansgaard-Oeschger events observed in Greenland ice-core records. All the available deep records, GRIP, GISP2, NGRIP and NEEM, are used for the analysis. EWS are changes in the statistical properties of a time series that indicate a bifurcation-induced transition (b-tipping); they will not appear prior to a noise-induced transition (n-tipping). The aim is thus to identify, for each of the 17 DO events in the well-dated records of the past 60 kyr, which would be due to b-tipping and which would be due to n-tipping, assuming classical bistable dynamics. As the detailed dynamics of the transitions are largely unknown, the simplest assumption (Occam’s razor type of argument) is that of a saddle-node bifurcation in a system subject to noise. In such a system the variance will, from the fluctuation-dissipation theorem, increase when approaching the bifurcation point, and so will the autocorrelation. This is the phenomenon of critical slowing down. For any other suggested scenarios for the transitions, different EWS could potentially be detected. Since the transitions documented in the paleoclimatic records have already happened, detected EWS obviously play the role of hindcasts rather than forecasts; the purpose of detecting EWS is thus dynamical system identification.
A fair statistical significance test is constructed by bootstrapping through the generation of so-called Truncated Fourier Transform Surrogates (TFTS), which are simply surrogate time series constructed by randomly choosing the phases (not shuffling them) of the Fourier coefficients while keeping the amplitudes of the original signal. “Truncated” refers to not changing the phases of the long-wavelength coefficients, in order to preserve trends in the time series. Since the variance and the autocorrelation of a time series depend only on the amplitudes of the Fourier coefficients, the TFTS will have the same variance and autocorrelation as the original time series over the full Greenland stadial (GS) period analyzed. The EWS indicators are then calculated within 200-yr running windows for each of the GS periods prior to the DO transitions, and the slope of a linear fit to this indicator time series is calculated. A significant slope (at the 95% confidence level) is identified from the distribution of slopes in the TFTS time series. From this analysis it is established that only a few DO events are preceded by EWS, in agreement with the expectation that about one of the 17 DO events should appear significant by chance at the 95% confidence level, motivating the title of the paper.
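The surrogate construction described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `tfts_surrogate`, the cutoff parameter `n_keep` (the number of lowest-frequency coefficients whose phases are left untouched) and the test series are all assumptions introduced here.

```python
import numpy as np

def tfts_surrogate(x, n_keep, rng):
    """Return one phase-randomised surrogate of x (illustrative sketch).

    The phases of the n_keep lowest-frequency Fourier coefficients are
    kept, so slow trends survive; all higher-frequency phases are drawn
    uniformly at random. Amplitudes are never changed.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    coeffs = np.fft.rfft(x)
    # For even n the last coefficient (Nyquist) must stay real, so skip it.
    stop = coeffs.size - 1 if n % 2 == 0 else coeffs.size
    new_coeffs = coeffs.copy()
    n_random = max(stop - n_keep, 0)
    random_phases = rng.uniform(0.0, 2.0 * np.pi, size=n_random)
    new_coeffs[n_keep:stop] = np.abs(coeffs[n_keep:stop]) * np.exp(1j * random_phases)
    return np.fft.irfft(new_coeffs, n=n)

# Usage on a synthetic random walk (hypothetical data, not an ice-core record).
rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(256))
surrogate = tfts_surrogate(series, n_keep=4, rng=rng)
```

Because only phases are altered, the surrogate has the same Fourier amplitudes, and hence the same variance, as the input over the full segment, which is the property the comment above relies on.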
The findings confirm our earlier findings (Ditlevsen and Johnsen, 2010), so in some sense this is a report of negative results. However, I find that the paper presents useful methods for this kind of analysis, and I therefore recommend publication. I do, however, recommend a revision for clarification and better readability:
- The GS vary in duration; a typical GS lasts perhaps 2 kyr, which means that there are about ten independent 200-yr window measurements. Thus, the linear trend is fitted to only 10 points or maybe even fewer. Furthermore, for the 20-yr resolution records, there are only ten points within a 200-yr window from which the EWS are calculated. A discussion of the uncertainty and the quality of the estimates is lacking.
- A consistency check should be made between the significant EWS found for some, but not both, EWS indicators and for some, but not all, records (which are obviously false positives) and the number of false positives expected from the bootstrapping.
- Figures 4-9 are difficult to read, unless they are only to be read like white-pink-red barcodes. Consider showing just one full width time series (for each EWS) and present the consistency between resolutions/methods/records in a figure similar to Figure S7 or S8, in order to get a better overview. There is a lot of information in the text, which makes reading difficult.
- Increases in the weights of specific wavelet coefficients and increases in the local Hurst exponent, H, are suggested as EWS. However, it is not argued what the assumed underlying (complex) system exhibiting these EWS is. There are references to earlier papers by the same authors (Rypdal, 2016, Boers, 2018), but these references do not provide such justifications. It is mentioned that the Hurst exponent is an estimate for the correlation, but if this is the only argument for calculating H, the authors should at least argue why it is more reliable to calculate H than to simply analyze the autocorrelation (which is also done). It would substantially strengthen the paper if such arguments could be presented.
- A discussion of how a Hurst exponent can meaningfully be calculated from about one decade of time scales is lacking. What are the uncertainties?
- In line 230 it is stated that for a linear stochastic process increase in variance and increase in autocorrelation are independent. This is not true: For the OU process x, we have Var(x) ~ -1/(log AC(1)).
- The notation sigma^2 for Var(x) is unfortunate, since the underlying assumption of the EWS is that the locally stationary process is the OU process: dx = -alpha x dt + sigma dB, where alpha_1 = alpha dt (lag-1) and Var(x)=sigma^2/(2 alpha). Thus sigma^2 represents the square of the intensity of the noise. I recommend using Var(x) for the variance.
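The relation between variance and autocorrelation invoked in the last two comments can be written out explicitly. This is a sketch under the assumption that the locally stationary process is the OU process named above, sampled at a fixed interval $\Delta t$ (a symbol introduced here for illustration), with $\mathrm{AC}(1)$ denoting the lag-1 autocorrelation:

```latex
\begin{align}
\mathrm{AC}(1) &= e^{-\alpha \Delta t}
  \;\Longrightarrow\;
  \alpha = -\frac{\log \mathrm{AC}(1)}{\Delta t}, \\
\mathrm{Var}(x) &= \frac{\sigma^{2}}{2\alpha}
  = -\frac{\sigma^{2}\,\Delta t}{2 \log \mathrm{AC}(1)} .
\end{align}
```

For fixed noise intensity sigma and fixed sampling interval, Var(x) is therefore a monotonic function of AC(1), and both increase as alpha approaches zero.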
I hope these comments are useful for the authors.
Citation: https://doi.org/10.5194/egusphere-2024-3567-RC1 -
AC2: 'Reply on RC1', Clara Hummel, 10 Mar 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3567/egusphere-2024-3567-AC2-supplement.pdf
-
RC2: 'Comment on egusphere-2024-3567', John Slattery, 31 Jan 2025
General Comments:
This is an extremely thorough and comprehensive analysis of whether Early Warning Signals (EWS) are present before Dansgaard-Oeschger (DO) events in Greenland ice cores. Importantly, this study rigorously addresses the methodological limitations and discrepancies that have led to conflicting results in previous studies on this topic. The statistical approach used to detect EWS is justified by the close match between the analytical and numerical distributions for the number of false positives. Although this detection of significant EWS preceding individual transitions is carried out extremely well, I believe there is an issue with the analysis of whether the number of observed EWS is in turn significant itself. This could impact the findings and therefore needs to be carefully addressed. Overall, I recommend that this manuscript is published subject to revisions.
Specific Comments:
The only major issue in this otherwise excellent manuscript concerns the analysis of whether the number of observed EWS is significant. As the authors correctly state on line 218: "For x ∼ B(17, 0.05), it is P(x ≤ 2) ≈ 0.9497 < 0.95 and P(x ≤ 3) ≈ 0.9912 > 0.95." However, the authors then mistakenly infer from this that "at a confidence level of 95%, we expect at most two events to show spurious significant early warning, and observing three significant EWS is statistically significant." In fact, the number of EWS required for statistical significance at the 95% level is N, where N is the smallest integer such that, for x ∼ B(17, 0.05), P(x < N) > 0.95. The crucial difference is that the probability of x being less than but not equal to N must be more than 95%, not less than or equal to as the authors imply. The significance threshold at the 95% confidence level using this analytical distribution is therefore four significant EWS observed, not three.
One can consider this in an equivalent way that may be clearer by thinking instead about the p-value as compared to the significance level (i.e. 1 - confidence level). A result is significant at the 5% significance level if, under the null hypothesis, the probability p of observing a result at least this extreme is less than 5% (i.e. p < 0.05). In our case, the number of observed EWS required for significance is N, where N is the smallest integer such that, for x ∼ B(17, 0.05), P(x ≥ N) < 0.05. P(x ≥ 3) = 1 - 0.9497 = 0.0503 > 0.05, and so observing three EWS is not quite statistically significant at the 5% / 95% level, whilst observing four is.
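The threshold argument above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and are not taken from the manuscript. It assumes, as in the comment, independent tests with a false-positive probability of 0.05 for each of the 17 transitions.

```python
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def significance_threshold(n, p, alpha):
    """Smallest count N with P(X >= N) < alpha, or None if there is none."""
    for k in range(n + 1):
        if binom_tail(n, p, k) < alpha:
            return k
    return None

# 17 transitions, each tested at the 5% level.
p_three_or_more = binom_tail(17, 0.05, 3)               # about 0.0503, just above 0.05
threshold_95 = significance_threshold(17, 0.05, 0.05)   # 4
threshold_90 = significance_threshold(17, 0.05, 0.10)   # 3
```

Under these assumptions the tail probability for three or more false positives lies marginally above 0.05, so the smallest count that is significant at the 5% level is four, while three suffices at the 10% level.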
To see clearly that the authors’ approach is mistaken, consider Figure 3(a&b). Both the analytical and numerical distributions show that there is a 16% probability of 2 out of the 17 transitions showing false positive EWS. Despite this, the authors indicate in Figure 3b that observing two EWS is significant at the 95% level using the numerical distribution. Elsewhere, including Figure 10, they also indicate that observing two EWS is significant at the 90% level for both distributions. Observing two EWS cannot be significant at either confidence level or with either distribution, though, because this happens by chance 16% of the time! For another example, consider the distribution for simultaneous EWS in Figure 3c. Using the authors’ logic, P(X ≥ 0) > 0.95 and so 0 transitions with simultaneous EWS in both indicators would be a significant positive result, which clearly cannot be the case. I hope that these examples demonstrate that my comment here is not merely a statistical foible or a petty criticism, but that it has a real impact on the findings of this study.
The comparison of the analytical and numerical distributions is a fantastic way to show that the significance test for EWS preceding individual transitions works as intended, and I applaud the authors for including this. However, having done so, I think it would be better to then consider only the significance threshold for the number of observed EWS derived from the analytical distribution. This would simplify the analysis by making the threshold the same for all records and indicators. Currently there is sometimes (e.g. Figure 3b) a discrepancy between the thresholds for the two distributions, even though they match very well, just because P(x < 3) is so incredibly close to 0.95.
Line by line comments:
Line 22 and elsewhere: This study describes δ18O in Greenland ice cores as a local temperature proxy, following the traditional interpretation. However, recent isotope-enabled modelling (Buizert et al. 2024, https://doi.org/10.1073/pnas.2402637121) suggests that winter sea ice variation may instead be the dominant control on δ18O during DO events. I suggest that this new interpretation should be briefly discussed, either in the introduction or in Section 4.2.
Figure 3: I think it would also be better to place the significance threshold lines between integers, as it is currently unclear whether observing a number of EWS equal to the significance threshold is significant or not. Indeed, Figure 3c seems to suggest that the significance threshold is 0, if interpreted in the same way as a & b, which cannot be the case. This should of course also account for the corrected significance thresholds based on my main comment, and the same also applies to Figures A3 and S19.
Lines 253-254: "Furthermore, we don’t restrict the search for wavelet-based EWS to the GS until 200 years prior to events to include potential influences of the transitions themselves." This sentence is unclear to me.
Lines 287-288: "Though, observing two significant EWS in α1 is only significant with respect to the analytical, but not the numerical null-distribution.” Based on Figure 3b this appears to be the wrong way round, as the numerical threshold is two EWS whilst the analytical threshold is three. Either way, as mentioned above, I think it would simplify the analysis to consider only the analytical null distribution.
Figure 10: It is difficult to distinguish between zero and undefined using this colour scheme. The circles indicating significance should also be corrected as discussed above.
Lines 480-481: “Recent advancements in EWS methods have … introduced new methodologies (Clark et al., 2002).” It seems odd to call a study from 23 years ago a recent advancement. Perhaps a different reference was intended here, otherwise this sentence should be reworded slightly.
Lines 529-530: “It has been shown before that the current NGRIP site was located at a higher altitude and further upstream, closer to NGRIP than it is today.” This sentence is unclear. I think that the authors perhaps intended to write NEEM here instead of NGRIP.
Fig S3c in Supplementary Information: The line for the 95% confidence interval is either hidden or missing.
Technical Comments:
Line 202: “allows to handle data” is missing a word. This should perhaps read “allows us to handle data”.
Line 287: “…and autocorrelation for DO-12 Though, observing two significant EWS…”. I think there ought to be a full stop between “DO-12” and “Though”.
Line 314: “resolutions.Another” is missing a space after the full stop
Figure 7 caption: “(e-f) Same as (c-d) but with modified estimator calculation.(g-h) Same as (e-f) but with modified data preprocessing.Line colours and shadings are applied in the same way as in Fig. 4.” Spaces are missing after both full stops.
Line 459: “notably DO-1,6” is missing a space after the comma.
Lines 519-520: “(Guillevic et al., 2013; Seierstad et al., 2014; Capron et al., 2021; Steen-Larsen et al., 2013)” These references are not in chronological order.
Line 548: “on parts the record” is missing a word.
I hope that these comments are helpful, and I look forward to reading the authors’ response.
Citation: https://doi.org/10.5194/egusphere-2024-3567-RC2 -
AC1: 'Reply on RC2', Clara Hummel, 10 Mar 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3567/egusphere-2024-3567-AC1-supplement.pdf
-
RC3: 'Comment on egusphere-2024-3567', Marlene Klockmann, 07 Feb 2025
Hummel et al (ESD): Inconclusive Early warning signals for Dansgaard-Oeschger events across Greenland ice cores
Hummel and colleagues calculate early warning signal indicators for Greenland Stadial to Interstadial transitions based on d18O measurements from different Greenland ice cores. They assess the effect of a large set of methodological choices on the number of detected EWS. Their sensitivity analysis covers temporal resolution and irregular sampling, as well as specific methodological choices regarding preprocessing, sensitivity testing and the calculation of the EWS indicators themselves. Of the methodological choices, changing the sensitivity testing had the largest impact on the number of detected EWS. The temporal resolution of the record also seems to have a strong impact. There is no DO event for which a significant EWS can be identified across all tested methods, ice cores and resolutions. For a subset of two DO events (out of 17), an EWS based on increasing variance could be detected across two different ice cores (NGRIP & NEEM) and two different resolutions (5 and 10 years). The results highlight the challenges (or even impossibility?) in constraining the mechanism behind DO events based on EWS.
Given that EWS are of great interest for future climate and that DO events are one of the few known abrupt changes that occurred in the "recent" past for which the applicability of EWS can be tested, such a study is of great importance and worth publishing. Having said that, I believe that some clarifications and some more physical context are needed before the manuscript can be accepted for publication. I list my major and minor concerns below:
Major comments (the order is not indicative of importance):
1. Regarding the choice of EWS indicators: Boers (2021) used another indicator, the restoration rate lambda, to avoid misinterpretations of false positives and EWS if the signal is coming from "increasing variance and auto-correlation of the external noise that forces the system". Would this be also relevant here? Why or why not? Is the difference here wrt Boers (2021) that for current observations we do not know yet if a transition will happen or not, while we know this for the past? (prediction vs classification problem?)
2. Regarding the method modifications wrt Boers (2018): Without being deep into the EWS methods, some of the modifications made in the present study may appear quite arbitrary. I suppose all changes were made to improve things or test a different equally plausible parameter choice. But I would ask the authors to explain the reasoning behind each step, why the tested method may be an improvement, or why it is important that a certain parameter or similar be tested.
3. Regarding the respective modification steps:
- Did you test the methods only in sequence or also individually, e.g. step 3 without having performed step 1 and 2 first? Do you expect that the effects of all modifications add up linearly?
- Why is step 3 the last thing you test? Should that not be the first (since you did not test different combinations of the modifications)? And given that step 3 does not seem to have any effect, could it not be omitted here for compactness (you might simply mention it in a side sentence)?
- Step 1 has three items but they are all changed together? What has the biggest effect? The TFTS surrogates or the "entire time series" vs "GS-only"?
4. Interpretation of the obtained numbers of EWS: The major part of the results seems to be a mere reporting of which method modification led to how many detected EWS. Similar to comment 2, it would be good if the authors could provide some more interpretation of why the number of detected EWS changes in the different cases, and of what the changes tell us about the respective transitions. Can we learn something about the nature of a transition if an EWS is detected with one method but not with another? Please provide some physical interpretation/context.
5. Regarding the length of the Stadials: Does it matter that the intervals over which the EWS indicators are calculated are of different lengths? Trends will depend very much on the interval chosen; is that of relevance here? Also, the stadial between DO1 and DO2 contains the LGM; does it make sense to include it?
6. Regarding the broader picture: After your analysis, would you say there is a "best way" to estimate EWS? Do we now know more or less about EWS in general and ice cores in particular?
7. Regarding uncertainty: In Section 4.2, uncertainties in d18O are also discussed. Is it possible to take the proxy uncertainty into account? Or is it already being accounted for and I missed it? Does it matter that proxy noise is typically not white? Would EWS calculated for ensemble mean across ice cores be insightful and perhaps more robust?
Minor comments:
e.g. l.14 and l.122: be careful with words like "physical mechanism" and "climate background" - can the EWS really give insight into the physical mechanisms in terms of actual processes/feedbacks? Do you not rather mean the underlying tipping dynamics/bifurcation types?
l.52-53: Consider moving sentence to beginning of paragraph starting l.85
l.52/l.85 ff: The introduction opens the link between DO events and possible future AMOC transitions. Would this possible future transition not be more similar to the GI-GS transition, which is apparently very much understudied in terms of EWS? I very well understand that your focus is the GS-GI transition, but perhaps you could comment on this?
l.130-156: To me, it is not 100% clear which part of the resampling/interpolation of the ice cores has been done for this study and which was already done in previous studies. And regarding the NGRIP core(s), are the respective NGRIP cores independent cores or resampled/interpolated versions of the same irregularly sampled NGRIP core?
l.221-229 / l.188: What does it imply exactly, if the number of EWS is significant with respect to the analytical but not the numerical null-distribution? Is one a stronger/more meaningful constraint than the other?
l.269-273: The erroneous calculation was part of Boers (2018)?
l.277-278: Does this refer to the fact that the Boers (2018) curve is much smoother?
l.504-506: would a systematic offset affect variance and auto-correlation?
l.506-508: Please always name the cores consistently. For readers not super familiar with the exact locations of the respective cores, it is difficult to immediately identify, which are e.g. the summit cores.
l.528-532: I find this paragraph confusing, especially the second sentence. Please check the sentence and reformulate for clarity. The first sentence compares NEEM and NGRIP, the second sentence only mentions NGRIP, but four times. Is this correct?
l.548-551: Would you not think, that a robust EWS should be detected, regardless of the lab that processed the core? If an EWS indicator is affected by the processing lab, then the usefulness of the indicator is rather limited, no?
Finally, to aid the overall flow and the interpretation of the results in their physical context, I would suggest switching Sections 4.1 and 4.2. That would result in a much stronger ending than the discussion of possible ice core differences.
Citation: https://doi.org/10.5194/egusphere-2024-3567-RC3 -
AC3: 'Reply on RC3', Clara Hummel, 10 Mar 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3567/egusphere-2024-3567-AC3-supplement.pdf