The ICON-based Earth System Model for Climate Predictions and Projections (ICON XPP v1.0)

Müller, Wolfgang A.; Lorenz, Stephan; Pham, Trang V.; Schneidereit, Andrea; Brokopf, Renate; Brovkin, Victor; Brüggemann, Nils; Chegini, Fatemeh; Dommenget, Dietmar; Fröhlich, Kristina; Früh, Barbara; Gayler, Veronika; Haak, Helmuth; Hagemann, Stefan; Hanke, Moritz; Ilyina, Tatiana; Jungclaus, Johann; Köhler, Martin; Korn, Peter; Kornblüh, Luis; Kroll, Clarissa; Krüger, Julian; Castro-Morales, Karel; Niemeier, Ulrike; Pohlmann, Holger; Polkova, Iuliia; Potthast, Roland; Riddick, Thomas; Schlund, Manuel; Stacke, Tobias; Wirth, Roland; Yu, Dakuan; Marotzke, Jochem

doi:10.5194/egusphere-2025-2473

12 Jun 2025

Abstract. We develop a new Earth System model configuration framed into the ICON architecture, which provides the baseline for the next generation of climate predictions and projections (hereafter ICON XPP – where XPP stands for eXtended Predictions and Projections). ICON XPP is an outcome of a joint project between climate research institutes and the Deutscher Wetterdienst, integrating numerical weather prediction and Earth System modeling and prediction based on the ICON framework. ICON XPP comprises the atmospheric component as used for the numerical weather prediction (ICON NWP), the ICON ocean and land surface components, and an ensemble-variational data assimilation system, all adjusted to an Earth System model for pursuing climate research and operational climate forecasting. Here, two baseline configurations are presented, one with a 160 km atmosphere and a 40 km ocean resolution, and one with 80 km atmosphere and 20 km ocean resolution, and a first evaluation is pursued based on the CMIP DECK (Diagnostic, Evaluation and Characterization of Klima) experimentation framework. Emphasis is given to the basic assessment of their mean climate, trends and climate sensitivity, and key processes in the tropics and mid-latitudes are examined, which are of relevance for climate predictions.

ICON XPP is able to depict the basic properties of the coupled climate. The pre-industrial climate shows a balanced radiation budget at the top-of-atmosphere and a mean global near-surface temperature of about 13.8–14 °C. The ocean shows circulation strengths in the range of the observed values, such as the AMOC at 16–18 Sv and the flows through the common passages. The current climate is characterized by a trend in the global mean temperature of ~1.2 °C since the 1850s, close to what is found in reference datasets. At regional scale, however, the hydroclimate deviates strongly from observed conditions. For example, the inter-tropical convergence zone (ITCZ) is dominated by a double peak with a particularly wet southern subtropical branch over the oceans. Further, the climate in the Southern Ocean is characterized by a strong positive mean bias, with the sea surface temperature too high by up to 5 °C.

Key dynamical processes are presented, such as the El Niño/Southern Oscillation (ENSO), whose overall performance is in line with CMIP6-like coupled models. However, in the present configuration, the amplitude is ⅔ of the observed values, and the ENSO feedbacks are underestimated. Further, tropical waves and the Madden-Julian Oscillation are captured well, and a spontaneous weak quasi-biennial oscillation is found in the 80 km atmosphere configuration. The atmospheric dynamics in the extra-tropics of both configurations are particularly noteworthy. ICON XPP exhibits a good representation of the jet stream position, particularly in the northern extra-tropics. Closer investigations show that the influences of the transient momentum transports and their feedbacks on the jet stream are well reproduced in ICON XPP. Stratospheric dynamics further reveal a sufficiently strong polar vortex and an adequate number of sudden stratospheric warmings. A clear improvement is found for all processes in the higher-resolution configuration compared to the lower-resolution one. Overall, ICON XPP performs at a similar level in the tested climate simulations as climate models performed in CMIP6 and forms a good basis for application in the areas of climate forecasts and projections, as well as climate research.

How to cite. Müller, W. A., Lorenz, S., Pham, T. V., Schneidereit, A., Brokopf, R., Brovkin, V., Brüggemann, N., Chegini, F., Dommenget, D., Fröhlich, K., Früh, B., Gayler, V., Haak, H., Hagemann, S., Hanke, M., Ilyina, T., Jungclaus, J., Köhler, M., Korn, P., Kornblüh, L., Kroll, C., Krüger, J., Castro-Morales, K., Niemeier, U., Pohlmann, H., Polkova, I., Potthast, R., Riddick, T., Schlund, M., Stacke, T., Wirth, R., Yu, D., and Marotzke, J.: The ICON-based Earth System Model for Climate Predictions and Projections (ICON XPP v1.0), EGUsphere [preprint], https://doi.org/10.5194/egusphere-2025-2473, 2025.

Received: 27 May 2025 – Discussion started: 12 Jun 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1: 'Comment on egusphere-2025-2473', Anonymous Referee #1, 07 Aug 2025
This manuscript presents ICON XPP, the first version of a new climate model specifically tailored for climate prediction applications at two alternative model resolutions.
Both the model documentation and the evaluation of the two resolution configurations are comprehensive, spanning seasonal to decadal scales and covering important drivers of predictability such as the ocean and atmospheric circulations, the stratosphere and key modes of internal climate variability. The authors employ a wide range of diagnostics and benchmark against well-established datasets, demonstrating the model’s strengths and identifying areas for improvement.
I find the paper to be of interest and suitable for publication in GMD, pending the authors’ response to a few clarifications and minor comments listed below.
The manuscript is generally clear but contains frequent typos and grammatical errors. I recommend a thorough proofreading to improve its readability.

Sentence in lines 105-109: The final part of the sentence seems to be incomplete. Did you mean to say that MPI-ESM has been used for developing machine learning methodologies? If that’s the case, in which way?

Sentence in lines 109-110: The phrasing is odd. I would simply say that MPI-ESM has been used to conduct operational decadal climate predictions.

Sentence in lines 110-113: This sentence would benefit from some rephrasing too. I suggest simply saying that decadal prediction skill in the model has been shown to arise from near-term memory in the North Atlantic Ocean heat content and from the externally-forced long-term trends. Also, note that “prediction skill” should be written in singular.

Sentence in lines 113-116: Instead of indicating the processes for which the prediction skill has been assessed, it would be more interesting to state those for which the model shows actual skill, and those for which it doesn’t.

Lines 118-119: The carbon uptake by the ocean is not an Earth System component; it is an Earth System process.

Line 172: What is TERRA? You have not properly introduced it.

Lines 244 to 246: Node characteristics can vary widely across machines. Could you also indicate the throughput in terms of the number of processors per day (to allow a more direct comparison with other models)?
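
To make the request above concrete, the following is a minimal sketch of two machine-independent cost measures often reported for climate models: simulated years per wall-clock day (SYPD) and core-hours per simulated year. The function name and all input numbers are hypothetical and are not taken from the manuscript.

```python
# Hypothetical helper and hypothetical numbers, for illustration only.
def throughput_metrics(simulated_years, wallclock_hours, nodes, cores_per_node):
    """Return (SYPD, core-hours per simulated year) for a single model run."""
    # Simulated years per day of wall-clock time.
    sypd = simulated_years / (wallclock_hours / 24.0)
    # Total core-hours spent, normalised by the number of simulated years.
    total_cores = nodes * cores_per_node
    core_hours_per_year = total_cores * wallclock_hours / simulated_years
    return sypd, core_hours_per_year


# Example: 50 simulated years in 24 wall-clock hours on 10 nodes of 128 cores.
sypd, chsy = throughput_metrics(
    simulated_years=50, wallclock_hours=24, nodes=10, cores_per_node=128
)
print(f"SYPD = {sypd:.1f}, core-hours per simulated year = {chsy:.1f}")
```

Reporting the core count alongside SYPD lets readers compare costs across machines with different node sizes.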

Section 2.3: The manuscript would benefit from additional detail on the tuning procedure. Specifically, it would be helpful to explain how the parameter choices summarized in Table 1 were determined, whether they were based on tailored experiments, and if so, what kind of experiments were conducted.

Sentence in lines 270-271: I agree that reported TOA values are within the acceptable range and compare well with residual imbalances documented in other CMIP6 models. However, in the last 500 years of the spin-up simulations both the 160/40 and 80/20 configurations exhibit consistently positive and negative TOA imbalances, respectively, without oscillating around zero. This suggests a persistent net energy gain in one case and loss in the other, which may have implications for long-term climate stability and should be acknowledged and discussed in the manuscript.

Line 304: Could you indicate here and in Table 2 how many members you have run per ensemble?

Figure 8: In this figure the red color represents the 160/40 configuration and the blue color the 80/20 one, but in Figure 1 it is the other way around. I suggest using the same color convention to ease the comparability of the figures.

Line 335: It would be fair to comment that the reference period for PIOMAS corresponds to a much warmer climate than for the preindustrial simulations, which implies that the preindustrial sea ice thickness is expected to be larger.

Line 344: A peak magnitude should be a flat number, not a range.

Line 350: I would explicitly say that you refer to the transport through ocean passages that are important for the climate system.

Figure 5: A key process for the AMOC and decadal variability (and predictability) in the North Atlantic is deep water mixing in the Labrador and Irminger Seas, which is controlled by density stratification. It would be extremely useful to show how they are represented in the two model configurations, given the goal of using them to perform decadal climate predictions.

Table 3: The title of the first column is incorrect. It’s not an experiment, but an ocean passage that you are listing in the column.

Line 489: I would replace “key indicator for” with “critical parameter that determines the”.

Sentences in Lines 502-505: The assumption of linearity doesn't seem to hold in the last 130 years of the 80/20 configuration, which has an R square of 0.1 that is most likely not statistically significant. Can you discuss which implications this has for your estimate?

Line 520: Did you mean to say “principal modes of climate variability”?

Sentence in lines 528-529: You should specify that this statement refers to the atmosphere.

Sentence in lines 539-540: The phrasing could be improved. I suggest simply saying that you use OLR as it is generally assumed to be a reasonable proxy for deep tropical convection and precipitation.

Paragraphs in lines 541-560 and Figure 12: You discuss (and cite) first the symmetric component, but you show first the antisymmetric one. I would swap the two rows in the figure to follow the order of the discussion. Also, I suggest acknowledging in the text that your comparisons are just visual and do not consider any statistical significance. Indeed, it is unclear to what extent some of the highlighted improvements for the higher resolution happen by chance.

Sentence in lines 555-556: You refer to Figure 12d, but should it be Figure 12a-c, which are the panels corresponding to the antisymmetric component?

Sentence in lines 589-590: It would be clearer if you specified that Tropflux is your reference for evaluation. I didn't notice until I read the caption of the next figure. Also, could you provide some details on that dataset and provide the corresponding reference? It is not a well-known product.

Line 601: Change “for all” to “for all metrics”.

Figure 14: A legend is missing in one of the spaghetti plots to explain what each line represents.

Sentence in lines 606-609: The link between SST and wind stress biases in the western and central Pacific is not entirely clear. Aside from the edges (150°E-160°E and 240°E-270°E), the SST slope is quite similar across both simulations and the reference dataset, which is not the case for wind stress.

Line 618: How do you define this ENSO amplitude?

Line 622: Can you explain why it is important to evaluate ENSO skewness?
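
For reference, one common convention (which may differ from the one used in the manuscript) takes the ENSO amplitude as the standard deviation of the Niño3.4 SST anomaly index and the ENSO asymmetry as its skewness, which is positive when warm events are stronger than cold events. A minimal sketch, with a hypothetical function name and a toy index in place of real data:

```python
import numpy as np


def enso_amplitude_and_skewness(nino34_anomalies):
    """Return (standard deviation, skewness) of an SST anomaly index."""
    x = np.asarray(nino34_anomalies, dtype=float)
    x = x - x.mean()              # remove any residual mean
    std = x.std()                 # population standard deviation (ddof=0)
    skew = np.mean(x**3) / std**3  # third standardised moment
    return std, skew


# Toy index (not real data): three weak cold months and one strong warm month.
toy_index = [-1.0, -1.0, -1.0, 3.0]
amplitude, skewness = enso_amplitude_and_skewness(toy_index)
print(f"amplitude = {amplitude:.3f} K, skewness = {skewness:.3f}")
```

In practice the input would be a detrended monthly anomaly series with the seasonal cycle removed; those preprocessing steps are omitted here.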

Line 649: Please avoid using the term “significant” in this context as you have not really assessed statistical significance.

Lines 658-660: This sentence could be rephrased for clarity. I interpret that you mean to say that with the cheaper configuration you can more easily explore the space of hyperparameters in your model to identify potential tuning improvements for ENSO representation.

Lines 695-697: I don’t think this statement is correct as it is written. The way I understand it, in the extra-tropics, changes in the zonal and meridional jets are closely linked to changes in major modes of climate variability like the NAO, a link that needs to be well represented for the predictability of these modes and their climate impacts.

Sentence in lines 731-732: I don’t understand what you mean to say here.

Sentence in lines 769-770: For me the most important advantage of showing the wind anomalies is that they better show the downward propagation.

Sentence in lines 771-772: For a robust assessment on the simulated QBO periodicity you could perform a spectral analysis of the QBO index for the models and ERA5.
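
A minimal sketch of such a spectral analysis is given below, using a plain periodogram from NumPy's FFT. The index here is synthetic (a 28-month sine wave plus noise, chosen only for illustration), and the function name is hypothetical; in practice the input would be the monthly equatorial zonal-mean zonal wind at a chosen pressure level from each model and from ERA5.

```python
import numpy as np


def dominant_period(index, dt=1.0):
    """Return the period (in units of dt) of the largest periodogram peak."""
    x = np.asarray(index, dtype=float)
    x = x - x.mean()                        # remove the mean before the FFT
    power = np.abs(np.fft.rfft(x)) ** 2     # raw periodogram
    freqs = np.fft.rfftfreq(x.size, d=dt)   # cycles per time step
    peak = np.argmax(power[1:]) + 1         # skip the zero-frequency bin
    return 1.0 / freqs[peak]


# Synthetic monthly index: 28-month oscillation plus white noise (assumption).
rng = np.random.default_rng(0)
months = np.arange(560)
synthetic_index = np.sin(2 * np.pi * months / 28.0) + 0.3 * rng.standard_normal(months.size)

period = dominant_period(synthetic_index)
print(f"dominant period: {period:.1f} months")
```

Comparing the location and width of the spectral peak across the two configurations and the reanalysis gives a more objective measure of periodicity than visual inspection of a time-height plot.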

Sentence in lines 777-779: You cannot really tell if the improvement comes just from the enhanced vertical resolution or from other factors, as there are other things that also change between the two configurations.

Sentence in lines 785-786: Could you explain what is different in those experiments with respect to your experiments besides being performed only with the atmospheric component?

Sentence in lines 807-811: This comparison is subject to considerable uncertainty, as it is based on counts from a single decade. This is particularly relevant given that the historical simulations do not capture the observed internal variability. To assess whether the model produces a reasonable number of MSSWs, it would be more informative to construct a histogram of event counts across multiple decades, both in the simulations and in ERA5. It would also be useful to examine the seasonal distribution of MSSWs, i.e., the months in which they tend to occur. In your current plot, based on the selected decade, it appears that both models simulate MSSWs earlier in the season compared to ERA5, and the events also seem to be shorter in duration.
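
The two diagnostics suggested above could be sketched as follows, assuming the MSSW central dates have already been identified by whatever criterion the manuscript uses. The function names and the event dates are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter
from datetime import date


def count_by_decade(event_dates, first_year, last_year):
    """Count events per calendar decade, keeping decades with zero events."""
    counts = Counter((d.year // 10) * 10 for d in event_dates)
    first_decade = (first_year // 10) * 10
    return {dec: counts.get(dec, 0) for dec in range(first_decade, last_year + 1, 10)}


def count_by_month(event_dates):
    """Count events per calendar month to show their seasonal distribution."""
    return dict(sorted(Counter(d.month for d in event_dates).items()))


# Hypothetical central dates (not real events).
events = [
    date(1981, 1, 15), date(1984, 2, 3), date(1987, 12, 20), date(1992, 3, 9),
    date(2003, 2, 14), date(2005, 1, 5), date(2008, 1, 30),
]

per_decade = count_by_decade(events, first_year=1980, last_year=2009)
per_month = count_by_month(events)
print(per_decade)
print(per_month)
```

Binning by calendar year is a simplification; grouping by extended winter season (for example November to March) would avoid splitting a single winter across two decades.
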
Citation: https://doi.org/10.5194/egusphere-2025-2473-RC1
RC2: 'Comment on egusphere-2025-2473', Anonymous Referee #2, 10 Aug 2025

Review of The ICON-based Earth System Model for Climate Predictions and Projections (ICON XPP v1.0)
This manuscript describes ICON XPP, a new Earth system model configuration that replaces the previous MPI Earth system model, as part of the ICON framework. ICON XPP is tailored for climate predictions and projections at two available model resolutions, and aims to be used in the CMIP7 suite of models, after further tuning.
General review:
The manuscript provides a valuable contribution to the climate modelling community. The manuscript comprehensively documents and evaluates the ICON XPP model components, configuration and its tuning on decadal and seasonal simulation scales, at both model resolutions. Key drivers of climate prediction and projection are presented and assessed, comparing against other well-established models and ERA5 observational datasets. Assessment and comparison of the models’ ability to simulate key measures of climate state, such as climate sensitivity and also key dynamical processes clearly demonstrates where the model performs well and where further improvement could be beneficial. Overall, the manuscript is generally well-written, and the authors effectively present the evaluation of this new Earth system model using key metrics.
I therefore find this paper suitable for publication in GMD, after the authors’ response to the few minor revisions below:
General:
Overall the manuscript is well-written; however, I believe it could be condensed, as there are a number of lengthy sentences and paragraphs throughout that could be more concise for clarity and understanding.
There are also a number of spelling and grammatical mistakes throughout, I have highlighted some of these below, but I recommend a thorough read through to correct these.
General Abstract:
The abstract is too long, in my opinion. Ideally the abstract could be condensed into a more concise version, with only key results and information presented. There is some information I believe could be removed/ sentences condensed to reduce the size.
Introduction general:
While the introduction provides a good presentation of the background information required for understanding and guiding the reader through the manuscript, I find the structure could be improved. There is some repetition in some paragraphs, and others continue discussing a point made in previous paragraphs. I would recommend some minor restructuring of the introduction, so that it flows easily and provides a succinct background to the great main body of the manuscript.
e.g Line 82: “Here, we present ICON XPP, from the design of the configurations to a first evaluation of the Earth System state based on the CMIP DECK (Diagnostic, Evaluation and Characterization of Klima) experimentation framework (Eyring et al., 2016).”
This reads as if it should be placed at the end of the introduction? But this could be a personal opinion.
Minor corrections:
Line 47: ”At regional scale”. Should this read: “At a regional scale”?
Figure 1 and Figure 8: The colours used for each model resolution are switched between these figures; ensuring all figures have a consistent colour scheme would improve clarity.
Lines 76-78: “Since 2020, a new modelling initiative integrating numerical weather forecast, climate predictions and climate projections based on the ICON framework (Müller et al., 2025).” This sentence seems unfinished?
Line 79: ‘-,’ - The comma isn't needed.
Line 80: “ICON XPP will be the baseline for next generation climate predictions”… Should be: “ICON XPP will be the baseline for the next generation of climate predictions”…?
Line 107: Needs to be clearer
Lines 84-87: “Special attention is given to monitoring certain aspects of the tropical and extra-tropical mean climate, and the stratosphere, including key modes of variability and their predictability, such as the El Niño/Southern Oscillation (ENSO), or the North Atlantic Oscillation (NAO).” - This sentence is slightly verbose and clunky.
Line 88: “for the individual components” - Leaves the reader wondering what the individual components are. Could point to a table or list of the components, or leave the “individual components” part out?
Lines 107-108: “and it is recently for machine learning methodologies to assess…” - Should read, “and it has recently been used for machine learning methodologies to assess…”? If not, this sentence should be re-phrased.
Line 109: “MPI-ESM has been also used for decadal climate” - Should read, “MPI-ESM has also been used for decadal climate”.
Line 172: “TERRA” isn’t defined? A definition or acronym description here would be useful.
Line 264: “2 metre” - Should it read “2 meters”?
Lines 262-265: “The targets mainly consider the thermodynamic state of the atmosphere - depicted by the top-of-atmosphere radiation balance and global-mean temperature at 2 metre - and the ocean-cryosphere - by the strength of the Atlantic meridional overturning circulation (AMOC) and sea-ice properties” - A long sentence, could split into 2? The whole first paragraph of the 2.3 Tuning section could be re-worded for flow and clarity.
Line 276: “@26° N” – is this the correct symbol?
Line 277: “a small trend remains for the AMOC at the end of the simulation.” - Should this read ‘by the end of simulation”, instead of “at the end of simulation”? This would provide more clarity for me, implying a trend still remains by the end of the simulation? The authors could also elaborate on this trend more? A negative trend/ the AMOC weakens?
Line 284: “sediment,.” – should be “sediment.”? No need for extra comma.
Table 1: Could the sea ice parameters have units? E.g., sea ice melting 0.25: is this million km²? Or is there a reason no units are provided? Some more information could be provided on the tuning process for the reader to understand how each value was reached.
Line 307: “The climate sensitivity is estimated by 1%...”, should read “The climate sensitivity is estimated by a 1%...”?
Line 324: Again, I may have missed it but “AMIP-type”, could be defined or the acronym breakdown provided.
Line 328: “…for 160/40 and ~1.7 °C 80/20,…”, missing a ‘for’? should read “…for 160/40 and ~1.7 °C for 80/20,…”
Line 330: "The sea-ice reveal reasonable…”, missing ‘simulations’? Should read: “The sea-ice simulations reveal”?
Line 330: “peak season”/ “minimum season” could be defined? As the terms “peak season”/ “winter season/ growth season” etc are used interchangeably. Could be useful to provide the months referred to in brackets.
Line 344: “peak magnitudes of ~15-20 Sv at 26° N at 1000m depth,…”, should it read “and 1000m depth…”? In addition, should the peak magnitude be a range, or a single number?
Table 3: misaligned data in the observations column/ Denmark Strait row?
Line 423: “a small ensemble of three…” – I would be interested to know why three were generated? Was this a computational constraint, or something else? An explanation in-text could be useful?
Line 437: “The causes are currently unclear, and further investigations are in progress.” Is there any literature available yet? Is it possible to cite these further investigations?
Line 448: “Smaller errors” – could be made clearer? “Smaller margin of error”?
Table 4: It would be nice to see Table 4 placed nearer to where comparison to observations is mentioned, if possible.
Line 577: Could define the “cold tongue bias” better, this sentence could also benefit from some citation.
Line 588: “Nino3.4-related”, could you introduce this? A definition or explanation would be useful here.
Line 590: “As many coupled models…” – should read “As in many coupled models…”?
Line 691: “Among other…” – should read “Among others…”?
Line 698: “… and meanwhile…”, do you need both?
Line 716: Full stop missing after citation (Müller et al., 2018)
Line 737: “well-behaviour”, is this the appropriate word? Are you referring to good performance?
Line 786: Brackets are needed around the citation (Niemeier et al., 2024), or sentence needs re-phrasing if not.
Line 795: “However, several factors – among others volcanic eruptions…”, grammatical errors here/ more punctuation needed.
Line 806: The sentence would read better if it were “winter northern hemisphere” not “hemispheric”? Punctuation is also missing from this sentence: “shows, for example, …”.
Line 812: “exhibits comparable variations as in observations”, this sentence could be much shorter e.g “comparable to observations”.

Citation: https://doi.org/10.5194/egusphere-2025-2473-RC2
AC1: 'Comment on egusphere-2025-2473 - Response to Review #1', Wolfgang Müller, 12 Sep 2025

Please find the uploaded file, which contains a point-by-point response to the reviewer's comments.

Citation: https://doi.org/10.5194/egusphere-2025-2473-AC1
AC2: 'Comment on egusphere-2025-2473 - Response to Review #2', Wolfgang Müller, 12 Sep 2025

Please find the uploaded file, which contains a point-by-point response to the reviewer's comments.

Citation: https://doi.org/10.5194/egusphere-2025-2473-AC2

Status: closed

RC1:
'Comment on egusphere-2025-2473', Anonymous Referee #1, 07 Aug 2025
This manuscript presents ICON XPP, the first version of a new climate model specifically tailored for climate prediction applications at two alternative model resolutions.
Both the model documentation and the evaluation of the two resolution configurations are comprehensive, spanning seasonal to decadal scales and covering important drivers of predictability such as the ocean and atmospheric circulations, the stratosphere and key modes of internal climate variability. The authors employ a wide range of diagnostics and benchmark against well-established datasets, demonstrating the model’s strengths and identifying areas for improvement.
I find the paper to be of interest and suitable for publication in GMD, pending the authors’ response to a few clarifications and minor comments listed below.
The manuscript is generally clear but presents frequent typos and grammatical errors. I recommend a thorough proofreading to improve its readability.

Sentence in lines 105-109: The final part of the sentence seems to be incomplete. Did you mean to say that MPI-ESM has been used for developing machine learning methodologies? If that’s the case, in which way?

Sentence in lines 109-110: The phrasing is odd. I would simply say that MPI-ESM has been used to conduct operational decadal climate predictions.

Sentence in lines 110-113: This sentence would benefit from some rephrasing too. I suggest simply saying that decadal prediction skill in the model has been shown to arise from near-term memory in the North Atlantic Ocean heat content and from the externally-forced long-term trends. Also, note that “prediction skill” should be written in singular.

Sentence in lines 113-116: Instead of indicating the processes for which the prediction skill has been assessed, it would be more interesting to state those for which the model shows actual skill, and those for which it doesn’t.

Lines 118-119: The carbon uptake by the ocean as not an Earth System component, it’s an Earth system process.

Line 172: What is TERRA? You have not properly introduced it.

Lines 244 to 246: Node characteristics can largely vary across machines. Could you also indicate the throughput in terms of the number of processors per day (to allow a more direct comparison with other models)?

Section 2.3: The manuscript would benefit from additional detail on the tuning procedure. Specifically, it would be helpful to explain how the parameter choices summarized in Table 1 were determined, whether they were based on tailored experiments, and if so, what kind of experiments were conducted.

Sentence in lines 270-271: I agree that reported TOA values are within the acceptable range and compare well with residual imbalances documented in other CMIP6 models. However, in the last 500 years of the spin-up simulations both the 160/40 and 80/20 configurations exhibit consistently positive and negative TOA imbalances, respectively, without oscillating around zero. This suggests a persistent net energy gain in one case and loss in the other, which may have implications for long-term climate stability and should be acknowledged and discussed in the manuscript.

Line 304: Could you indicate here and in Table 2 how many members you have run per ensemble?

Figure 8: In this figure the red color represents the 160/40 configuration and the blue color the 80/20 one, but in Figure 1 is the other way around. I suggest using the same color convention to ease the comparability of the figures.

Line 335: It would be fair to comment that the reference period for PIOMAS corresponds to a much warmer climate than for the preindustrial simulations, which implies that the preindustrial sea ice thickness is expected to be larger.

Line 344: A peak magnitude should be a flat number, not a range.

Line 350: I would explicitly say that you refer to the transport through ocean passages that are important for the climate system.

Figure 5: A key process for the AMOC and decadal variability (and predictability) in the North Atlantic is deep water mixing in the Labrador and Irminger Seas, which is controlled by density stratification. It would be extremely useful to show how they are represented in the two model configurations, given the goal of using them to perform decadal climate predictions.

Table 3: The title of the first column is incorrect. It’s not an experiment, but an ocean passage that you are listing in the column.

Line 489: I would change “key indicator for” with “critical parameter that determines the”.

Sentences in Lines 502-505: The assumption of linearity doesn't seem to hold in the last 130 years of the 80/20 configuration, which has an R square of 0.1 that is most likely not statistically significant. Can you discuss which implications this has for your estimate?

Line 520: Did you mean to say “principal modes of climate variability”?

Sentence in lines 528-529: You should specify that this statement refers to the atmosphere.

Sentence in lines 539-540: The phrasing could be improved. I suggest simply saying that you use OLR as it is generally assumed that is a reasonable proxy for deep tropical convection and precipitation.

Paragraphs in lines 541-560 and Figure 12: You discuss (and cite) first the symmetric component, but you show first the antisymmetric one. I would swap the two rows in the figure to follow the order of the discussion. Also, I suggest acknowledging in the text that your comparisons are just visual and do not consider any statistical significance. Indeed, it is unclear to what extent some of the highlighted improvements for the higher resolution happen by chance.

Sentence in lines 555-556: You refer to Figure 12d, but should it be to Figure 12a-c, that are the ones corresponding to the antisymmetric component.

Sentence in lines 589-590: I would be more clear if you specify that Tropflux is your reference for evaluation. I didn't notice until I read the caption of the next figure. Also, could you provide some details on that dataset and provide the corresponding reference? It is not a well-known product.

Line 601: Change “for all” to “for all metrics”.

Figure 14: It misses a legend in one of the spaghetti plots to explain what each line represents.

Sentence in lines 606-609: The link between SST and wind stress biases in the western and central Pacific is not entirely clear. Aside from the edges (150°E-160°E and 240°E-270°E), the SST slope is quite similar across both simulations and the reference dataset, which is not the case for wind stress.

Line 618: How do you define this ENSO amplitude?

Line 622: Can you explain why it is important to evaluate ENSO skewness?

Line 649: Please avoid using the term “significant” in this context as you have not really assessed statistical significance.

Lines 658-660: This sentence could be rephrased for clarity. I interpret that you mean to say that with the cheaper configuration you can more easily explore the space of hyperparameters in your model to identify potential tuning improvements for ENSO representation.

Lines 695-697: I don’t think this statement is correct as it is written. The way I understand it, in the extra-tropics, changes in the zonal and meridional jets are closely linked to changes in major modes of climate variability like the NAO, a link that needs to be well represented for the predictability of these modes and their climate impacts.

Sentence in lines 731-732: I don’t understand what you mean to say here.

Sentence in lines 769-770: For me the most important advantage of showing the wind anomalies is that they better show the downward propagation.

Sentence in lines 771-772: For a robust assessment on the simulated QBO periodicity you could perform a spectral analysis of the QBO index for the models and ERA5.

Sentence in lines 777-779: You cannot really tell if the improvement comes just from the enhanced vertical resolution or from other factors, as there are other things that also change between the two configurations.

Sentence in lines 785-786: Could you explain what is different in those experiments with respect to your experiments besides being performed only with the atmospheric component?

Sentence in lines 807-811: This comparison is subject to considerable uncertainty, as it is based on counts from a single decade. This is particularly relevant given that the historical simulations do not capture the observed internal variability. To assess whether the model produces a reasonable number of MSSWs, it would be more informative to construct a histogram of event counts across multiple decades, both in the simulations and in ERA5. It would also be useful to examine the seasonal distribution of MSSWs, i.e., the months in which they tend to occur. In your current plot, based on the selected decade, it appears that both models simulate MSSWs earlier in the season compared to ERA5, and the events also seem to be shorter in duration.
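
The multi-decade count and seasonal distribution proposed above could be tabulated along the following lines. This is a minimal sketch under stated assumptions: events have already been detected by some criterion (not specified here) and are given as (year, month) pairs, each event is assigned to the winter in which it occurs, and decades are counted from a chosen first winter. All function names and the event list are invented for illustration and do not come from the simulations or ERA5.

```python
from collections import Counter


def winter_year(year, month):
    """Label an event by the year in which its winter starts (Jul-Jun convention)."""
    return year if month >= 7 else year - 1


def events_per_decade(event_dates, first_winter, last_winter):
    """Count events per 10-winter block, keeping blocks with zero events."""
    counts = {start: 0 for start in range(first_winter, last_winter + 1, 10)}
    for year, month in event_dates:
        w = winter_year(year, month)
        if first_winter <= w <= last_winter:
            counts[first_winter + 10 * ((w - first_winter) // 10)] += 1
    return counts


def events_by_month(event_dates):
    """Count events per calendar month (1 = January, ..., 12 = December)."""
    return Counter(month for _, month in event_dates)


# Invented (year, month) event dates, for illustration only.
events = [(1961, 2), (1963, 1), (1968, 12), (1971, 3), (1979, 2),
          (1981, 3), (1985, 1), (1987, 12), (1989, 2)]

per_decade = events_per_decade(events, first_winter=1960, last_winter=1989)
by_month = events_by_month(events)
# Histogram of decadal counts: how many decades contained k events.
count_histogram = Counter(per_decade.values())
```

Running this for every ensemble member and for the reanalysis would give a distribution of decadal counts against which a single observed decade can be judged.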
Citation: https://doi.org/10.5194/egusphere-2025-2473-RC1
RC2: 'Comment on egusphere-2025-2473', Anonymous Referee #2, 10 Aug 2025

Review of The ICON-based Earth System Model for Climate Predictions and Projections (ICON XPP v1.0)
This manuscript describes ICON XPP, a new Earth system model configuration that substitutes the previous MPI Earth system model, as part of the ICON framework. ICON XPP is tailored for climate predictions and projections at two available model resolutions, and aims to be used in the CMIP7 suite of models, after further tuning.
General review:
The manuscript provides a valuable contribution to the climate modelling community. The manuscript comprehensively documents and evaluates the ICON XPP model components, configuration and its tuning on decadal and seasonal simulation scales, at both model resolutions. Key drivers of climate prediction and projection are presented and assessed, comparing against other well-established models and ERA5 observational datasets. Assessment and comparison of the models’ ability to simulate key measures of climate state, such as climate sensitivity, and key dynamical processes clearly demonstrate where the model performs well and where further improvement could be beneficial. Overall, the manuscript is generally well-written, and the authors effectively present the evaluation of this new Earth system model using key metrics.
I therefore find this paper suitable for publication in GMD, once the authors have responded to the few minor revisions below:
General:
Overall the manuscript is well-written; however, I believe it could be condensed, as there are a number of lengthy sentences and paragraphs throughout that could be more concise for clarity and understanding.
There are also a number of spelling and grammatical mistakes throughout. I have highlighted some of these below, but I recommend a thorough read-through to correct them.
General Abstract:
The abstract is too long, in my opinion. Ideally the abstract could be condensed into a more concise version, with only key results and information presented. There is some information I believe could be removed/ sentences condensed to reduce the size.
Introduction general:
While the introduction provides a good presentation of the background information required for understanding and guiding the reader through the manuscript, I find the structure could be improved. There is some repetition in some paragraphs, and others continue discussing a point made in previous paragraphs. I would recommend some minor restructuring of the introduction, so that it flows easily and provides a succinct background to the great main body of the manuscript.
e.g., Line 82: “Here, we present ICON XPP, from the design of the configurations to a first evaluation of the Earth System state based on the CMIP DECK (Diagnostic, Evaluation and Characterization of Klima) experimentation framework (Eyring et al., 2016).”
This reads as if it should be placed at the end of the introduction? But this could be a personal opinion.
Minor corrections:
Line 47: ”At regional scale”. Should this read: “At a regional scale”?
Figure 1 and Figure 8: The colours used for each model resolution are switched between these figures. Ensuring all figures have a consistent colour scheme would improve clarity.
Lines 76-78: “Since 2020, a new modelling initiative integrating numerical weather forecast, climate predictions and climate projections based on the ICON framework (Müller et al., 2025).” This sentence seems unfinished?
Line 79: ‘-,’ - The comma isn't needed.
Line 80: “ICON XPP will be the baseline for next generation climate predictions”… Should be: “ICON XPP will be the baseline for the next generation of climate predictions”…?
Line 107: Needs to be clearer
Lines 84-87: “Special attention is given to monitoring certain aspects of the tropical and extra-tropical mean climate, and the stratosphere, including key modes of variability and their predictability, such as the El Niño/Southern Oscillation (ENSO), or the North Atlantic Oscillation (NAO).” - This sentence is slightly verbose and clunky.
Line 88: “for the individual components” - Leaves the reader wondering what the individual components are. Could point to a table or list of the components, or leave the “individual components” part out?
Lines 107-108: “and it is recently for machine learning methodologies to assess…” - Should read, “and it has recently been used for machine learning methodologies to assess…”? If not, this sentence should be re-phrased.
Line 109: “MPI-ESM has been also used for decadal climate” - Should read, “MPI-ESM has also been used for decadal climate”.
Line 172: “TERRA” isn’t defined? A definition or acronym description here would be useful.
Line 264: “2 metre” - Should it read “2 meters”?
Lines 262-265: “The targets mainly consider the thermodynamic state of the atmosphere - depicted by the top-of-atmosphere radiation balance and global-mean temperature at 2 metre - and the ocean-cryosphere - by the strength of the Atlantic meridional overturning circulation (AMOC) and sea-ice properties” - A long sentence, could split into 2? The whole first paragraph of the 2.3 Tuning section could be re-worded for flow and clarity.
Line 276: “@26° N” – is this the correct symbol?
Line 277: “a small trend remains for the AMOC at the end of the simulation.” - Should this read “by the end of the simulation” instead of “at the end of the simulation”? This would provide more clarity for me, implying a trend still remains by the end of the simulation. The authors could also elaborate on this trend: is it a negative trend, i.e. does the AMOC weaken?
Line 284: “sediment,.” – should be “sediment.”? No need for extra comma.
Table 1: Could the sea ice parameters have units? E.g., sea ice melting 0.25: is this million km²? Or is there a reason no units are provided? Some more information could be provided on the tuning process so that the reader can understand how each value was reached.
Line 307: “The climate sensitivity is estimated by 1%...”, should read “The climate sensitivity is estimated by a 1%...”?
Line 324: Again, I may have missed it but “AMIP-type”, could be defined or the acronym breakdown provided.
Line 328: “…for 160/40 and ~1.7 °C 80/20,…”, missing a ‘for’? Should read “…for 160/40 and ~1.7 °C for 80/20,…”.
Line 330: “The sea-ice reveal reasonable…”, missing ‘simulations’? Should read: “The sea-ice simulations reveal”?
Line 330: “peak season”/ “minimum season” could be defined? As the terms “peak season”/ “winter season/ growth season” etc are used interchangeably. Could be useful to provide the months referred to in brackets.
Line 344: “peak magnitudes of ~15-20 Sv at 26° N at 1000m depth,…”, should it read “and 1000m depth…”? In addition, should the peak magnitude be a range, or a single number?
Table 3: misaligned data in the observations column/ Denmark Strait row?
Line 423: “a small ensemble of three…” – I would be interested to know why three were generated? Was this a computational constraint, or something else? An explanation in-text could be useful?
Line 437: “The causes are currently unclear, and further investigations are in progress.” Is there any literature available yet? Is it possible to cite these further investigations?
Line 448: “Smaller errors” – could be made clearer? “Smaller margin of error”?
Table 4: It would be nice to see Table 4 placed nearer to where comparison to observations is mentioned, if possible.
Line 577: Could define the “cold tongue bias” better, this sentence could also benefit from some citation.
Line 588: “Nino3.4-related”, could you introduce this? A definition or explanation would be useful here.
Line 590: “As many coupled models…” – should read “As in many coupled models…”?
Line 691: “Among other…” – should read “Among others…”?
Line 698: “… and meanwhile…”, do you need both?
Line 716: Full stop missing after citation (Müller et al., 2018)
Line 737: “well-behaviour”, is this the appropriate word? Are you referring to good performance?
Line 786: Brackets are needed around the citation (Niemeier et al., 2024), or sentence needs re-phrasing if not.
Line 795: “However, several factors – among others volcanic eruptions…”, grammatical errors here/ more punctuation needed.
Line 806: The sentence would read better if it were “winter northern hemisphere” not “hemispheric”? Punctuation is also missing from this sentence: “shows, for example, …”.
Line 812: “exhibits comparable variations as in observations”, this sentence could be much shorter, e.g., “comparable to observations”.

Citation: https://doi.org/10.5194/egusphere-2025-2473-RC2
AC1: 'Comment on egusphere-2025-2473 - Response to Review #1', Wolfgang Müller, 12 Sep 2025

Please find an uploaded file that contains a point-by-point response to the reviewer's comments.

Citation: https://doi.org/10.5194/egusphere-2025-2473-AC1
AC2: 'Comment on egusphere-2025-2473 - Response to Review #2', Wolfgang Müller, 12 Sep 2025

Please find an uploaded file that contains a point-by-point response to the reviewer's comments.

Citation: https://doi.org/10.5194/egusphere-2025-2473-AC2

Viewed

Total article views: 2,396 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
2,254	118	24	2,396	40	47

Views and downloads (calculated since 12 Jun 2025)

Month	HTML	PDF	XML	Total
Jun 2025	171	27	5	203
Jul 2025	118	28	1	147
Aug 2025	439	19	4	462
Sep 2025	1,254	16	5	1,275
Oct 2025	140	18	4	162
Nov 2025	132	10	5	147

Viewed (geographical distribution)

Total article views: 2,350 (including HTML, PDF, and XML), of which 2,350 have a defined geography and 0 are of unknown origin.

Latest update: 22 Nov 2025

Short summary

ICON XPP is a newly developed Earth System model configuration based on the ICON modeling framework. It merges accomplishments from the recent operational numerical weather prediction model with well-established climate components for the ocean, land and ocean-biogeochemistry. ICON XPP reaches typical targets of a coupled climate simulation, and is able to run long integrations and large-ensemble experiments, making it suitable for climate predictions and projections, and for climate research.