ENSO teleconnections in eddy-rich climate models
Abstract. We examine how ENSO atmospheric teleconnections are represented in a novel suite of coupled simulations with eddy-resolving ocean and high-resolution atmosphere, at an unprecedented grid spacing of ∼10 km in both components. The single-member, multi-decadal experiments have been performed under a coordinated protocol within the European Eddy-RIch Earth System Models (EERIE) project using three different models.
To assess the performance of the EERIE models, we design tailored metrics to encapsulate and quantify different aspects of the ENSO teleconnections: direct tropical response, Rossby wave sources, extra-tropical tropospheric and stratospheric anomalies, and surface impacts. The metrics are based on linear regressions of several atmospheric fields onto the Niño3.4 index in early and late winter. Additionally, we apply the same diagnostics to a set of complementary atmosphere-only simulations run at lower resolution (∼30 km, 10 members) and high resolution (∼10 km, 1 member), which allow us to isolate the impact of atmospheric resolution and to estimate the internal variability.
We find mixed results in the EERIE coupled simulations compared to previous-generation eddy-parametrized and eddy-permitting models (maximum ∼25 km in both atmosphere and ocean). The performance, though overall positive, varies by season, region, and model configuration, and a systematic improvement does not emerge clearly. The atmosphere-only experiments likewise indicate limited advances from the increased atmospheric resolution. However, potential benefits may be hindered by the large uncertainty in the ENSO response due to internal variability and sampling.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-547', Anonymous Referee #1, 20 Mar 2026
- RC2: 'Comment on egusphere-2026-547', Anonymous Referee #2, 25 Mar 2026
Review of manuscript egusphere-2026-547 by Mezzina et al.
The study presented in this manuscript analyses the atmospheric response to ENSO in various variables, using a set of coupled and atmosphere-only simulations, with a focus on three eddy-rich simulations from the EERIE exercise. The authors compare the responses to ERA5.
Although the manuscript is of interest, the methodology used to compare the simulations with each other and with ERA5 is not very convincing. I am not sure that averaging the response fields in boxes, sometimes even taking the difference between two boxes, is the best way to compare the simulations. In addition, the calculation of the ranges (the error bars in the plots) is not very clear to me, and the evaluation of the performance with the ranking may not be the best approach (see comments below). Therefore, I recommend major revisions.
Major comments:
- Concerning the ranking to evaluate the performance:
I would not associate the ranking with an assessment of the performance. Although it is considered the best representation of reality, ERA5 is still a model output forced by sea surface temperatures, as the ocean is not coupled. Moreover, this ranking is quite subjective: even if a simulation receives a high rank number (e.g., between 23 and 34), it does not mean that it is much “worse” than the HighResMIP simulations, as the difference with the closest HighResMIP simulation may be very small. This difference value is not taken into account in the ranking. A metric using the actual range of HighResMIP may be more useful. As an example, in line 265, I would not say “they perform surprisingly well”; I would say that they are much closer to ERA5 than the other simulations.
If the authors choose to keep Fig. 10, I suggest making it the third figure of the manuscript.
- Concerning the regional averages:
The boxes do not always fit the location of the anomalies. For example, in Fig. 5a, the southern box in the North Atlantic sector overlaps with both positive and negative anomalies. Isn’t that a problem? The values in this box would be closer to 0 than if a box located further south were chosen. Moreover, the authors add further uncertainty by taking the difference between the averages in the two boxes over the North Atlantic domain. Personally, I would find it more interesting, and maybe more robust, to look at pattern correlations within basins (North Pacific, North America, and North Atlantic) to evaluate the representation of the teleconnection in comparison with ERA5 (a minimal sketch is given after this list of major comments).
- Concerning the error bars in Figures 2, 4, 6, and 7:
Could the authors explain in more detail how the error bars are calculated? It is very unclear to me.
For IFS-LR-AMIP, I would have expected a use of the members in the calculation of the uncertainty, but it does not seem to be the case. Also, since there are no error bars on the HighResMIP simulations (coupled and AMIP), does it make sense to include the error bars when comparing with the ensemble range (see my comment below for line 296)?
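To make the pattern-correlation suggestion above concrete, a minimal sketch is given below, assuming the model and ERA5 regression maps are available as 2-D arrays on a common latitude-longitude grid; the variable names and the North Pacific bounds are purely illustrative and not taken from the manuscript:

```python
import numpy as np

def pattern_correlation(model_map, ref_map, lats, lons, lat_range, lon_range):
    """Area-weighted spatial (pattern) correlation between two regression maps
    over a rectangular basin, e.g. the North Pacific."""
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    sub_m = model_map[np.ix_(lat_mask, lon_mask)]
    sub_r = ref_map[np.ix_(lat_mask, lon_mask)]
    # cosine-of-latitude weights, broadcast to the 2-D sub-domain
    w = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    # centred, area-weighted covariance and variances
    m_anom = sub_m - np.average(sub_m, weights=w)
    r_anom = sub_r - np.average(sub_r, weights=w)
    cov = np.average(m_anom * r_anom, weights=w)
    return cov / np.sqrt(np.average(m_anom**2, weights=w) *
                         np.average(r_anom**2, weights=w))

# Hypothetical usage for a North Pacific box:
# r = pattern_correlation(z500_model, z500_era5, lats, lons, (20, 70), (150, 240))
```

A correlation of this kind could be computed per basin and per season, and its sampling uncertainty estimated by bootstrapping the years entering the regression.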
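On the error bars themselves, one possible way to construct an uncertainty range - either from the spread of the regression slope across the 10 IFS-LR-AMIP members, or from a bootstrap over years for a single-member simulation - is sketched below; this is only an assumption about what could be done, not a description of the authors' actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def regression_slope(nino34, field_index):
    """Least-squares slope of an area-averaged field index onto the Nino3.4 index."""
    return np.polyfit(nino34, field_index, 1)[0]

def bootstrap_ci(nino34, field_index, n_boot=1000, alpha=0.05):
    """Bootstrap confidence interval for the slope of a single member,
    resampling years with replacement."""
    n = len(nino34)
    slopes = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        slopes.append(regression_slope(nino34[idx], field_index[idx]))
    return np.percentile(slopes, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# For a 10-member AMIP ensemble, the spread of per-member slopes could be used instead:
# member_slopes = [regression_slope(nino34, member) for member in members]
# spread = np.percentile(member_slopes, [2.5, 97.5])
```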
Minor comments:
- Line 57: “a more realistic extra-tropical response”: this is a bit vague. What is the response the authors are talking about? In which basin? And how is it “more realistic”?
- Line 60: “the position of the North Pacific response”: do the authors mean the centre of the mean sea level pressure anomaly or something else? Please be more precise.
- Line 65: “more accurate late-winter ENSO teleconnections”: again this statement deserves more details. How is the accuracy measured and which field is used to assess the teleconnection (sea level pressure, precipitation,…)?
- Line 68: “the extra-tropical teleconnection”: what field do the authors refer to? And in which basin? North Atlantic?
- Lines 142-144: What is the spatial resolution of these observations? Are these observations ERA5? Are they eddy-rich? Please specify.
- Lines 153-154: For the atmosphere, why is 100 km placed in the high-resolution category while 50 km is in the low-resolution category? Same comment for the ocean: why is 25 km considered low resolution? It is very confusing.
- Line 159: “Due to data issues”: what are those issues? Are those issues well known by the community? Is there a reference for this?
- Line 228: “all models”: I would not say all models, but rather most models.
- Figure 2: 1) the x-axis of the top row does not seem to correspond to the x-axis of the bottom row. Please check.
2) How can the differences in the longitudinal positions be assessed? For example, is a 10° difference big?
- Figures 2 and 7: I suggest using a plus sign for the IFS-LR/HR-AMIP simulations, since they correspond to ERA5-rec, which is shown with the black plus sign. Also, I would remove the black plus sign from the coupled figures on the left (while leaving the lines) and remove the black cross sign from the AMIP figures on the right. This would make the comparisons clearer and more in line with Figures 4 and 6.
- Lines 232-236: The supplementary figure S3 shows very noisy curves. I am wondering whether the authors smoothed the curves before finding the position and amplitude of the precipitation anomaly. If not, how would the results change if the curves were smoothed? (A simple smoothing illustration is given after this list of minor comments.)
- Line 265: “from the group”: I suppose that the authors refer to all 34 simulations here. Is that right? Please adjust the text to make it clearer.
- Line 267: “where the response is usually too weak”: please specify for which simulations the response is too weak.
- Line 296: “HighResMIP”: I would say that IFS-FESOM is within the range of HighResMIP LR but not within the range of HighResMIP HR, unless the error bar is considered.
- Line 318: why using a box going so far south? 40-50°N should have been good enough, shouldn’t it?
- Figure S5: is the ranking shown here based on the absolute difference in longitude between the model and ERA5-hist?
- Line 336: “a switch that most HighResMIP members capture”: where can I see that?
- Line 357: “The observed response is also largely uncertain”: I see hatching on Fig. S6a. Why, then, is the response uncertain?
- Section 4: Can we consider the IFS-LR AMIP simulation similar to ERA5 but without the data assimilation part? If yes, the results in Fig. 9b,d mainly come from the absence of assimilation and wave coupling. Is that right?
- Line 449: “the wind climatology”: it is not the wind climatology that is displayed in Fig. S12. Please explain how it is related to the meridional gradient of the absolute vorticity.
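Regarding the noisy curves in Fig. S3 (comment on Lines 232-236 above), a simple illustration of the kind of smoothing that could be applied before locating the longitude and amplitude of the precipitation anomaly is given below; the window length and variable names are hypothetical:

```python
import numpy as np

def smooth_and_locate(lons, response, window=11):
    """Apply a centred running mean (in grid points) and return the longitude
    and amplitude of the maximum of the smoothed curve."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(response, kernel, mode="same")
    imax = int(np.argmax(smoothed))
    return lons[imax], smoothed[imax]

# Hypothetical usage:
# lon_max, amp_max = smooth_and_locate(lons, precip_regression, window=11)
```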
Technical comments:
- Line 29 and others: Sometimes the initial of the first author's first name appears in the citations, such as Roberts M. et al. 2018, Liu X. et al. 2021, whereas it should not.
- Line 88: “at surface” → at the surface
- Line 92: “Domaisen” → Domeisen
- Line 11: “simulation” → simulations
- In section 2.1.1: NG5, eORCA12, N640: as somebody not versed in grids, what do all those mean?
- Line 133: put a comma after “in this paper”
- Line 163: remove “previously”
- Line 171: “thus correlation and standard” → a correlation and a standard
- Line 172-173: “An additional property...in our plots.” Consider removing.
- Line 224: “over the over the” → over the
- Line 232: “thought” → though
- Line 268: “Fig. 4b” → Fig. 4a
- Caption Figure 4: “(a) Western Pacific (b) Tropical Pacific (c) Tropical Atlantic (d) Western Atlantic” → (a) Western Pacific, (b) Tropical Pacific, (c) Tropical Atlantic, and (d) Western Atlantic
“represent” → representing
- Line 302: “Fig. 6d” → Fig. 6c
- Caption Figure 7: “height anomalies averaged between 30°N-50°N and regressed on the N3.4 index”: I do not think it is what the authors actually do. The anomalies are obtained from the regression, right? Therefore, the “and regressed on the N3.4 index” should be deleted.
- Line 366: “statically” → statistically
- Caption Figure 9: “represent” → representing, “20123” → 2023
- Figure S10: “wrt refrence” → wrt reference
- Caption Figure S10: “ERA5-hist (1950-2014) is used as reference for HighResMIP and the EERIE coupled models, while ERA5-rec (1980-2023) is used for the EERIE AMIP simulations. While ERA5-rec (1980-2023) is used for the EERIE AMIP simulations.”→ ERA5-hist (1950-2014) is used as reference for HighResMIP and the EERIE coupled models, while ERA5-rec (1980-2023) is used for the EERIE AMIP simulations.
- Line 493: “generation eddy-permitting models” → generation of eddy-permitting models
- Caption Figure S12: please give the unit of the field displayed.
Citation: https://doi.org/10.5194/egusphere-2026-547-RC2
- RC3: 'Comment on egusphere-2026-547', Anonymous Referee #3, 01 Apr 2026
Review for "ENSO teleconnections in eddy-rich climate models" by Mezzina et al., 2026, submitted to WCD.
This paper examines the atmospheric response to ENSO in three very high resolution (eddy-rich) but single-member simulations, using both coupled atmosphere-ocean models and AMIP-type runs. Both the local tropical response and the remote upper-level and surface teleconnections are considered. The results indicate no clear improvement over past simulations with high (but not as high) resolution (HighResMIP).
I feel sympathetic toward the authors who presumably expected more from this joint effort and computationally expensive undertaking. Although the result is negative, it is nonetheless important and worthy of publication, if only to give pause to other scientists planning to perform similar simulations with their own eddy-rich model. The paper is well-written, with clear figures, and the authors provide a generally honest assessment of the models’ failings (except perhaps in the abstract). I also commend the authors for assessing the teleconnections separately for early and later winter, since the observed patterns are so different.
Unfortunately, a negative result requires careful consideration of how it is presented, so as to make it interesting and not needlessly long – admittedly a difficult task. The paper falls short in this regard because, while it is readily apparent that the resemblance to observations is not improved, I see no added value in ranking the results and comparing them systematically with HighResMIP and lower-resolution simulations – a rather tedious exercise that is both questionable and yields no additional insight. For starters, how does one assess the significance of such a ranking? Second, the boxes used to define the metrics appear highly subjective. Comparing spatial correlations would have made more sense to me, since at least the differences can be tested for statistical significance and are less sensitive to the choice of domain.
More importantly, do we really learn anything from the ranking and the lengthy descriptions of it? It would be one thing if these models consistently outperformed the older ones and then the ranking could serve as a way to quantify this. As it stands, little is learned from carefully ranking the RWS, the upper teleconnection and the surface signal, beyond the rather general conclusion that the atmospheric response to ENSO SST anomalies is complex (“and entails more than the tropical response alone”, line 454).
Another issue, in my opinion, concerns the order of presentation of the results, although I am somewhat ambivalent on this point. On the one hand, one could argue that, following common practice, results should be presented in order of increasing complexity, beginning with the simpler AMIP simulations. On the other hand, such an approach would make it somewhat awkward to subsequently motivate the analysis of the coupled simulations, where most of the effort clearly lies, given their poor performance. In that sense, the current structure is understandable, and it is not obvious to me that a better alternative exists. It does raise the question of why one would have expected the presence of mesoscale activity in the ocean to improve the atmospheric response in the first place, although this seems to stem more from the design of the overall project than from the presentation choices made in the paper itself.
The sampling issue in the AMIP eddy-rich simulation compared to the lower-resolution ones is also problematic, since any improvement would have to be dramatic for the former to fall outside the range of internal variability of the latter. This limitation, however, is not unique to the AMIP study, as it also arises when comparing single realizations of eddy-rich models with older-generation simulations.
Finally, while I understand that the paper is intended as an assessment of these eddy-rich simulations with regard to ENSO, the manuscript as it stands is quite dry. It would benefit from some dynamical discussion, both for context and as a possible explanation for the lack of improvement in the ENSO simulation. For instance, a brief explanation of the mechanisms that affect the polar stratospheric vortex during ENSO would be welcome, as would attempting to connect the deficiencies in the TRWS and in the ensuing teleconnections to biases in the wave-guide or basic-state vorticity gradients.
Given these issues, I recommend major revisions of the manuscript before it can be considered for publication. At the very least, substantial shortening is needed, and perhaps some re-ordering as indicated above, if not of the paper, then of the motivation for the project. I would also strongly recommend skipping the ranking in favor of spatial correlations or another more meaningful metric; in any case, the reader does not need to plow through tedious comparisons of rankings for each model, variable, region, and season.
Major comments and suggestions for revision:
1) A more compelling introduction is needed
Since I am recommending a major overhaul of the manuscript, I will not point out every instance of wording that could be improved, but I will do so for the Introduction, which I believe requires a thorough rewrite and restructuring to make it clearer and more engaging. Here are some specific issues:
- The opening sentences are very vague. What is meant by “different climate processes”? What is meant by a “variety of ocean regimes”? It does not make for a very compelling opening.
- There is some confusion as to whether the text is referring to oceanic or atmospheric resolution. The first introductory sentence alludes to “horizontal resolution of Earth system models”, but the rest of the first paragraph and most of the second one refer to ocean resolution only. Atmospheric resolution is suddenly mentioned in line 45, with no previous reference to its importance or benefits.
- It takes two paragraphs to get to ENSO, which is the focus of the paper, but even so we are told nothing about the specific oceanic processes that may be better represented in eddy-rich models and result in more realistic ENSO SST patterns (only that there are “serval (sic) ways through which a better resolved ocean could affect ENSO teleconnections”). Maybe you could cite the recent Siqueira and Kirtman (2026) paper and discuss improvements in the basic state, cold tongue, SST-wind feedbacks.
- Likewise, the reader would appreciate some discussion of what better resolved atmospheric processes and aspects of the atmospheric circulation are expected to contribute to a more realistic ENSO teleconnection (sharper vorticity gradients? more accurate waveguides? synoptic eddy feedbacks? )
- Lines 58-64: Stating that the improvements in the high-resolution simulations were attributed to the increased ocean resolution is underwhelming – unless the authors are trying to say that the increased atmospheric resolution itself did not play a role (if so, it is unclear)
- Lines 58-64: Should there not first be a discussion of how the simulated ENSO teleconnection is flawed in current models so that we may understand the importance of “a more accurate representation of the position of the North Pacific response” while “no improvement in the strength of the teleconnection” ?
2) Discard or minimize the ranking: I think the authors should really reconsider devoting so much space to describing the specific absolute and relative ranking of each simulation, since differences may correspond to minute differences in the metric itself and the authors do not test for the significance of these differences (thus in the rankings).
Given that they are only dealing with three EERIE models and that all three simulated responses as well as that in ERA-5 can be shown in a single panel (e.g., as in Fig. 1 and 3), which allows for clear visual assessment of the differences, it seems odd to use the scatter plots, which are based on single subjective metrics, to assess the performance of the EERIE models. The “longitude” metric is particularly problematic since determining the maximum location of a pattern (especially precipitation) is often ambiguous. For instance, according to Fig. 2, HadGEM has a negative velocity potential maximum in the central Pacific “that is too strong and displaced eastward”, but I have trouble seeing the displacement by eye in Fig. 1. And as I mention in the next comment, the discrepancies in the Maritime Continent seem a lot more concerning to me.
I would suggest the authors start by showing the first two figures and then skip to Figure 10 (but without the specific rank values, which seems to suggest they believe those numbers are meaningful). In this way they would be including the HighResMIP simulations in their comparison, for completeness, but they would not need all the scatter plots and the accompanying tedious descriptions of the rankings. They could then simply refer to that figure after analyzing each variable to illustrate whether the EERIE models perform better than HRMIP-HR or not.
Skipping the comparison of the metrics based on latitudinal averages, subjective boxes, pairs of boxes, etc., would go a long way toward making the manuscript more digestible, interesting, and objective. It would also eliminate questions about how the confidence levels (error bars) were estimated.
Additionally, consider computing spatial correlations to compare the teleconnection patterns.
3) Discuss Figure 1 in more detail: I think this figure is worth more discussion. For starters, the discrepancies in the Maritime Continent are very large: the response in FESOM is more than 50% weaker than in ERA-5, while the Indian Ocean response in HadGEM is strongly displaced towards the west. Since the differences between the simulated and ERA-5 velocity potential ENSO regressions “likely result from distinct SST anomalies”, would it not be more appropriate to begin by showing the ENSO SST regressions themselves? To me this is the first sobering indicator that these eddy-rich simulations are not going to be miraculous in terms of the ENSO teleconnection, for ND at least. Maybe this should be stated explicitly.
4) Response in JF: Why are the corresponding figures for Fig. 1 and 3 not presented for JF but referenced later (line 382)? I think all panels should show both ND and JF, as Figures 5 and 8 do.
5) TRWS: I would not say the TRWS around the dateline, north of the Equator, is “well-captured” by the models since it seems substantially weaker. The discussion of the differences in TRWS would be more interesting if you related it to differences in the divergence patterns and, later, in the resulting teleconnection patterns. For instance, HadGEM has the more realistic TRWS in the Central Pacific, consistent with stronger upper-level outflow (Fig. 1).
6) Upper-level winter teleconnection:
Please adjust your language regarding how well the models reproduce the ND response. You write that the models “broadly capture the response in the North Pacific” in ND and “adequately capture the signal in the North Pacific” in JF, which seem comparable assessments, but to me the models perform much better in JF. Also, you later state that “all three EERIE models score better than most HighResMIP members” and Figure 10 also suggests the simulated teleconnection patterns are better in JF.
I also don’t understand why you state that “the ND response in the North Pacific is not significant in IFS-NEMO” – I see hatching covering the entire negative anomaly. Perhaps you could also emphasize how the observed ND signal is strongest in the North Atlantic instead of the North Pacific, an aspect that no model reproduces (to my knowledge) - although it may be due solely to November (King et al. 2021). And as I said earlier, could you relate the deficiencies in the signal to differences in the TRWS?
7) Stratospheric polar vortex: This section cannot rely on supplementary figures alone. Since I suggest you eliminate the scatter plots, Fig. S6 could be moved to the main text. I would suggest, however, that you elaborate on the “expected deceleration of the polar vortex”, so the reader can follow. Also, what is happening in HadGEM? Is this a low-top model? Why is the signal discontinuous?
8) AMIP runs: it is hardly surprising that simulations with prescribed SST perform better than coupled simulations. The more potentially important result is that high resolution brings no general improvement over the low-resolution ensemble, although, as the authors indicate, this may be a sampling issue and a fairer comparison using 10-member ensembles of each might reveal some relative benefit. Given this, one wonders whether the computational resources would have been better spent on a larger AMIP HR ensemble, and whether the expectation of improved performance in the coupled simulations was justified or realistic.
9) Line 77: It is stated here that, during El Niño, compensating upper-level convergence occurs in both the western and eastern Tropical Pacific, but the linear regressions in Figure 1 only show the former. Is this due to El Niño/La Niña asymmetries or the fact that it is early winter?
10) Finally, I am left wondering whether the eddy-rich versions of each of the three models show improvement relative to their respective older versions.
More minor comments:
- Line 21: I don’t think it is accurate to say the performance is “overall positive” since the same sentence acknowledges that a “systematic improvement does not emerge clearly”
- Line 50. Please provide a reference for the debate you mention.
- Line 52: should be “ENSO SST patterns”.
- Line 80: delete the text in parenthesis
- Line 97: delete “fully”.
- Line 139: please clarify what is meant by “dampened cycle” and “stronger variability”
- Line 159: fewer, not less
- Line 174: delete “these types of”
- Line 194: unclear what “pure” means
- Line 197: their conclusions regarding what?
- Line 229: change to “all three models”. Also delete “like” in the next line.
- Line 289: I would not describe the Atlantic anomalies in Fig. 5a as being located at high and midlatitudes. The centers are located at 55ºN and 30ºN, so subpolar and subtropical would be more appropriate.
- Line 315: “distinguished” is not the right word. Do you mean distinct?
- Line 375: please provide a reference for this well-documented response
- Line 379: does it matter if an observed anomaly that is not statistically significant is not reproduced by the models?
- Line 405: unclear what “values” refers to.
- Line 460: sentence is incomplete “whose causes are argued”.
- You may want to cite the recent paper by Siqueira and Kirtman (2026) on the accuracy of ocean eddy-resolving (~ 1°) models for ENSO SST anomalies (https://link.springer.com/article/10.1007/s00382-026-08137-9)
Citation: https://doi.org/10.5194/egusphere-2026-547-RC3
- EC1: 'Comment on egusphere-2026-547', Camille Li, 13 Apr 2026
All three referees are positive about the value of the study, but have some concerns regarding methodology and the clarity of its contribution to knowledge about ENSO teleconnections in models. I believe the manuscript can and should be a valuable and useful result for the community.

The next step is the Final Response by the authors. My suggestion to the authors is that they focus on discussing the main concerns, along with a description of how they will address them. It would be helpful for some of the points (e.g., statistical significance) to include some preliminary results to give a sense of what direction these will take the manuscript. At this point, there is no need to address all the minor and technical points specifically, as large parts of the manuscript are likely to change if we move on to revisions (I expect this to be the case, but will reserve judgment until the Final Response is received).

The main concerns that I feel are most important to address, and which run as themes across the three reviews, are:
1) Comparison of the teleconnection signal to reanalysis and the relative ranking - methods, metrics, statistical significance, ranking
2) Assessment of internal variability / uncertainty ranges / error bars and what we learn from these
3) Overall framing of the study, including adequate discussion of dynamical implications, which is an important factor for the WCD audience (see scope statement https://www.weather-climate-dynamics.net/about/aims_and_scope.html). One reviewer has made nice suggestions for the introduction, but I believe this point applies to how the results are presented as well as to the summary and discussion sections.

Thanks to the reviewers for their careful consideration of the manuscript, and I look forward to hearing from the authors.
Citation: https://doi.org/10.5194/egusphere-2026-547-EC1
Data sets
EERIE simulations EERIE team https://eerie.cloud.dkrz.de
Review for "ENSO teleconnections in eddy-rich climate models" by Mezzina et al., 2026, submitted to WCD.
SUMMARY:
This study examines the ENSO atmospheric teleconnections mainly in three single-member eddy-rich coupled climate model experiments. A few metrics were created for the purpose of the evaluation. These include responses over the tropical Pacific, the North Pacific, the North Atlantic, and northern polar areas.
This should be an interesting and important study because of known problems in models' reproduction of ENSO teleconnections and the importance of ENSO teleconnections as a source of predictability, for example for subseasonal-to-seasonal forecasts.
Unfortunately, I find that there are serious weaknesses in the methodology, and therefore the conclusions are not borne out by the results presented. For this reason, I could not (and would not) comment or ask questions about the findings and conclusions regarding the performance of the models in terms of ENSO teleconnections.
The following major comments point to specific places in the manuscript where there are methodological issues.
MAJOR COMMENTS:
1. Lines 270 - 272, 294 - 296, 298 - 300, 301 - 302, 321 - 323, 326, etc.: My criticism here is about the comparisons with ERA5 (or between model experiments). I assume the qualitative statements (better, best, worse, etc.) drawn from the comparisons are based on the mean values. The confidence levels, although shown in the figures, are neither used in the evaluation nor discussed. There are substantial overlaps between these intervals.
I think at the very least there should be suitable statistical tests on the difference of the means between data with different variances (an illustration is given after these major comments). Statistical testing is standard practice in climate teleconnection research.
In the current form of the analyses presented, I am unable to form an opinion on the validity of these evaluations.
2. Figures 2, 7: Are the longitudinal locations indicated the mean locations? What are the ranges of variation in the longitudinal locations? Are the differences in location statistically significant?
3. Figure 10: I have doubts about the validity of the method used for these rankings and about their robustness, for the same reason given above. The relevant differences need to be checked for statistical significance.
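As an illustration of the unequal-variance test mentioned in major comment 1 (a Welch's t-test on the difference of means), a minimal sketch with purely synthetic input values is given below; the numbers are invented for the example and do not come from the manuscript:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic per-sample estimates of a box-averaged ENSO response (illustrative only),
# e.g. bootstrap replicates or per-member values for a model and for ERA5.
model_estimates = rng.normal(loc=1.0, scale=0.6, size=30)
era5_estimates = rng.normal(loc=1.4, scale=0.4, size=30)

# Welch's t-test: difference of means without assuming equal variances
t_stat, p_value = stats.ttest_ind(model_estimates, era5_estimates, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```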
MINOR COMMENTS:
1. Section 2: Tables listing the model experiments and the main features for each would help give a clearer view.
2. Figure 2: The naming of the experiments should be introduced when they are first described. See also my previous comment.
3. Line 231 (citing Fig. 10): For readability and by convention, figures should be presented in the order in which they are first referred to in the text. You jump from Fig. 2 to Fig. 10, and I had to scroll all the way to the end to find Fig. 10.
4. Line 302: Fig. 6d is for JF, not ND.
5. Line 366 "statically": statistically?
6. Supplementary figures: There are ten supplementary figures and they are cited extensively in description of the results and discussion in the main text. I don't think this is a correct use of supplementary figures. This affects the readability of the paper and processing of the information on the part of the reader. More attention is needed on what are the key results and focusing on presenting them nicely in the main body of the manuscript.