the Creative Commons Attribution 4.0 License.
Deficient ocean–atmosphere feedbacks constrain seasonal NAO prediction
Abstract. As the North Atlantic Oscillation (NAO) accounts for a dominant share of wintertime weather variability across the North Atlantic basin, it is a coveted target for seasonal prediction. Yet dynamical forecast systems continue to exhibit limited skill, in part due to deficiencies in representing ocean–atmosphere feedbacks. Here, mediation analysis – a statistical framework from causal inference – is applied to identify and quantify feedback pathways linking late-autumn North Atlantic sea surface temperature (SST) anomalies to the subsequent winter NAO. This approach is attractive because it is straightforward to apply, easy to interpret, and can be used directly on observations-derived data like reanalyses without requiring idealised model perturbation experiments.
The analysis reveals a physically coherent feedback sequence. Anomalous November SST patterns promote the gradual formation of a surface-pressure dipole rotated clockwise relative to the canonical NAO structure. This dipole induces advection anomalies in the western North Atlantic, which in turn modulate surface fluxes in the Subpolar Gyre and lower-tropospheric baroclinicity in the storm-track entry region east of Newfoundland. These changes nudge the NAO, which, once established, feeds back onto the fluxes and baroclinicity, reinforcing the anomaly and sustaining the circulation pattern.
A central finding is that a state-of-the-art seasonal prediction system fails to capture these feedback mechanisms. The baroclinicity pathway, the process through which changes in eddy growth reinforce the circulation anomaly, is particularly deficient, accounting for only 2 % of the lagged SST–NAO correlation in SEAS5 compared with 44 % in the ERA5 reanalysis. This misrepresentation likely represents a fundamental barrier to improved NAO forecast skill.
More broadly, the results demonstrate the potential of mediation analysis as a diagnostic tool for disentangling coupled feedbacks directly from observations, evaluating their representation in models, and guiding targeted improvements that could enhance seasonal prediction of the NAO.
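The percentages quoted above (2 % in SEAS5 versus 44 % in ERA5) read as a "proportion mediated". The minimal sketch below shows how such a number can arise from a single-mediator, product-of-coefficients decomposition fitted by ordinary least squares. It is an illustration under stated assumptions, not the manuscript's actual procedure: the function name `mediation`, the synthetic series `sst`, `egr` and `nao`, and all coefficients are hypothetical.

```python
import numpy as np

def standardise(v):
    """Return v as a zero-mean, unit-variance float array."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()

def mediation(x, m, y):
    """Single-mediator decomposition: total = direct + indirect (a * b)."""
    x, m, y = standardise(x), standardise(m), standardise(y)
    total = np.polyfit(x, y, 1)[0]       # total effect: regress y on x
    a = np.polyfit(x, m, 1)[0]           # path a: regress m on x
    # Regress y on x and m together: direct effect and path b.
    design = np.column_stack([x, m])
    (direct, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    indirect = a * b
    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }

# Synthetic example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n = 500
sst = rng.normal(size=n)                               # predictor
egr = 0.8 * sst + rng.normal(scale=0.6, size=n)        # mediator
nao = 0.5 * egr + 0.1 * sst + rng.normal(scale=0.8, size=n)  # outcome

result = mediation(sst, egr, nao)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because the inputs are standardised, the total effect equals the correlation between predictor and outcome, so the proportion mediated can be read as the share of that correlation carried by the mediator. With least-squares fits, the direct and indirect effects sum exactly to the total effect.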
Status: open (until 28 Nov 2025)
RC1: 'Comment on egusphere-2025-5075', Anonymous Referee #1, 27 Oct 2025
AC1: 'Reply to RC1', Erik Kolstad, 28 Oct 2025
I thank the reviewer for their thoughtful and constructive comments, which will be very helpful in improving the manuscript. I am encouraged by the reviewer’s assessment that the study is timely and relevant. Below I address the two major points raised; detailed revisions and responses to minor comments will be provided in the formal response.
Major point A: Use of October hindcasts
The motivation for using the October initialisations was to ensure sufficient variation in the November SST states. In the November runs, SSTs are nearly identical across ensemble members due to inertia, which I reckoned would reduce the usefulness of the data for mediation analysis. Nevertheless, I agree that November forecasts are operationally the most relevant for DJF predictions. I have therefore decided to base the analysis on the November initialisations. As stated in the paper, these results confirm that the main conclusions are not sensitive to this choice. The biases are somewhat smaller, but the mediation remains weak. For the November runs, the anomaly correlation coefficient between the modelled and observed NAO is 0.29 (p = 0.06), which is consistent with the findings of Baker et al. (2024), while for the October runs it was non-significant. Within SEAS5, the SST–NAO correlation is 0.19, smaller than for the October runs, indicating that the errors are not reduced as hypothesised.
Major point B: Use of ensemble means
I fully agree with the reviewer that comparing ensemble-mean relationships to observational relationships is problematic, as the ensemble mean effectively filters out much of the internal variability. I had in fact originally used individual ensemble members but switched to ensemble means for simplicity. Following the reviewer’s suggestion, I have now repeated the analysis using all ensemble members. The results are broadly consistent with those based on ensemble means, though the relationship between the SST–NAO correlation and the indirect effect (Fig. 5) strengthens for SEAS5. In my opinion this reinforces the conclusion that feedback mechanisms play a central role in determining NAO predictability.
Additionally, I have discovered a data-handling error affecting the Eady growth rate (EGR) calculations for the October runs. After correcting this issue, the indirect effect through baroclinicity increased and spatially it now more closely resembles the ERA5 result. As in ERA5, baroclinicity emerges as the dominant mediator (higher indirect effect than for the surface fluxes). I am grateful that the reviewer’s comments prompted this revision, which has clarified the results and improved the manuscript.
Citation: https://doi.org/10.5194/egusphere-2025-5075-AC1
AC2: 'Reply on AC1', Erik Kolstad, 30 Oct 2025
I have identified and corrected a data-handling error in the processing of the model data. This issue affected the calculation of the indirect (mediated) effects, particularly for the baroclinicity (Eady growth rate) pathway.
In the corrected analysis, the model performs slightly better in reproducing the indirect effect – especially through the baroclinicity pathway. Broadly speaking, the main conclusions of the study stand, but the numerical results have been revised.
Following the editor’s suggestion, I have sent the corrected manuscript directly to the editor, who will share it with the remaining reviewer(s). This ensures that the remaining review is based on the corrected version, while keeping the discussion open and transparent for all readers.
Citation: https://doi.org/10.5194/egusphere-2025-5075-AC2
AC3: 'Correction of data handling error', Erik Kolstad, 05 Nov 2025
I identified a data-handling error in the processing of the model data, related to the incorrect chronological sorting of years when concatenating files. I have now corrected this issue and re-run the analysis. Overall the results have changed slightly, but the main conclusions remain qualitatively the same. The PDF file contains new figures and summarises the updates to the findings.
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 277 | 73 | 19 | 369 | 5 | 4 |
RC1: 'Comment on egusphere-2025-5075', Anonymous Referee #1, 27 Oct 2025 (full text)
This article analyses the relationships between N Atlantic SST, heat fluxes and the North Atlantic Oscillation in a seasonal forecast system. The questions it asks are well thought out and the study is timely, relevant and of interest to many readers of this journal. However, while it has the potential to be excellent on all counts, I have had to mark it 'fair' for scientific content as it stands, because if I understand correctly how the analysis has been done, then there is a potentially large error in the analysis method related to the use of ensemble means as explained below.
MAJOR POINTS:
A) I remain unconvinced about the use of October hindcasts. November starts are normally used for DJF forecasts, so why not use the forecasts that are relevant to the problem? We should at least be reassured that November forecasts show similar, if perhaps weaker, errors.
B) The analysis is novel, relevant and interesting but there is a serious flaw. The analysis is carried out entirely on ensemble means (L186) and then compared to the observations (L325, L360 and throughout). This comparison is not valid. A simple example can illustrate why: assume for example that the NAO is entirely formed from unpredictable variability or 'noise'. In this case there would be no ensemble mean signal and no regressions between the modelled variables. However, the observational analysis will still show relationships, albeit from unpredictable 'noise'. In reality the difference will be less extreme as the NAO contains predictable and unpredictable components but the presented analysis would only be valid if the NAO is formed from entirely predictable variability. Fortunately, the problem is easily corrected as it simply needs to be redone on ensemble members. I hope this can be done as I still think this has the potential to be a very useful contribution but it is essential before publication.
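The concern in point B can be made concrete with a toy calculation. The sketch below is not the reviewer's pure-noise example and does not use SEAS5 or ERA5 data. It assumes a hypothetical predictor shared by all members, a hypothetical slope of 0.5, and independent unit-variance noise in each member. The ensemble size is also made up and the number of years is deliberately far larger than a real hindcast set to keep sampling noise small. Under these assumptions, averaging the members removes most of the noise, so a correlation computed from ensemble means is much stronger than one computed from any single realisation, and the observations are a single realisation.

```python
import numpy as np

rng = np.random.default_rng(42)
n_years, n_members = 500, 25   # hypothetical sizes, not those of a real hindcast set
slope = 0.5                    # hypothetical strength of the predictable link

# Predictor shared by all members in a given year (e.g. a slowly varying index).
x = rng.normal(size=n_years)
# Each member adds its own unpredictable noise to the predictable response.
y = slope * x[:, None] + rng.normal(size=(n_years, n_members))

# Correlation computed member by member, then averaged over members.
member_corr = np.mean(
    [np.corrcoef(x, y[:, k])[0, 1] for k in range(n_members)]
)
# Correlation computed after averaging the members in each year.
ens_mean_corr = np.corrcoef(x, y.mean(axis=1))[0, 1]

print(f"mean single-member correlation: {member_corr:.2f}")
print(f"ensemble-mean correlation:      {ens_mean_corr:.2f}")
```

Comparing the single-member value with observations is a like-for-like comparison, whereas the ensemble-mean value is not, which is why the analysis needs to be repeated on individual members.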
MINOR POINTS:
The article seems to be overly positive about empirical forecast methods. Several of the examples cited have not performed well after publication in real out of sample cases. This is often the case with such methods which have often been inadvertently tuned to non-causal relationships in sections of the past observational record. Please therefore refine the language to better represent this, for example by saying "...achieved potentially useful levels of skill (but note the comments below about real time forecast skill)..." and at L34: "often appear to outperform" as this is not really outperforming if based on noncausal factors.
L45: Suggest "high surface NAO" as some studies claim NAO skill from high level circulation fields that is not reflected in surface NAO predictions
L46: Baker et al 2024 reported similar levels of skill for the NAO from later generations of forecasts and similar ranking of systems so a better phrasing here would be "However, there is a wide range of performance between systems and system upgrades have not significantly improved overall skill". Please also remove comments about reducing skill as the reported changes are not significant.
L110: typo "aa"
L138: I did not understand why this implies 'many pathways'
Sec3.1: why is this particular system (ECMWF SEAS5) used? Is it because it has lower skill than some of the others (c.f. Sec 4.1) and so useful to detect errors? If so please say this.
L201-205: How are anomalies calculated in SEAS5 and ERA5?
P7 line 1: This seems odd as there are only N values to start with, so by definition there are many repeats and the samples are not independent. This will reduce spread and affect results like those in Fig. 5. Is there a simple inflation of spread that can be done to correct and compensate for this?
L268: what is the mean bias in the NAO?
L290: typo 'gyrefor'
L297: please state if this represents a positive feedback
L384: robust
L394, L405: grammar at the start of these sentences, please reword