AIMIP Phase 1: systematic evaluations of AI weather and climate models

Henn, Brian; Bretherton, Christopher S.; Koldunov, Nikolay; Lessig, Christian; Molina, Maria J.; Arcomano, Troy; Watt-Meyer, Oliver; Couairon, Guillaume; Singh, Renu; Brunstein, Robert; Hasson, Yana; Jost, Antonia; Brenowitz, Noah; Manshausen, Peter; Cresswell-Clay, Nathaniel; Durran, Dale; Chen Hall, Kyle Joseph; Yuval, Janni; Kochkov, Dmitrii; Hoyer, Stephan; Lopez-Gomez, Ignacio

doi:10.48550/arXiv.2605.06944

Preprints

01 Jun 2026

Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Abstract. We present the AI weather and climate model intercomparison project (AIMIP), Phase 1. Drawing from the rich tradition of intercomparisons in climate model development, we specify a common experiment, output data format, and training constraints (namely, training against historical reanalysis data) for AIMIP Phase 1 models. We aim to identify differences in modeling frameworks and AI architectural choices that influence model behavior, and build trust in AI weather and climate models through open data and evaluation. AIMIP Phase 1 models must simulate the atmosphere given specified historical sea surface temperatures over 1979–2024. We evaluate the models' performance using five major evaluation criteria: biases, trends, response to El Niño-related sea surface temperature anomalies, temporal variability, and out-of-sample generalization tests. We find that the AI models are able to simulate the historical climate and response to forcing as well as a conventional physically based model, but some AI models underestimate historical warming trends, and their predictions diverge in the out-of-sample generalization tests. We describe the AIMIP Phase 1 dataset, which is publicly available for additional evaluations.
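
To make two of the evaluation criteria named in the abstract (biases and trends) concrete, the following is a minimal sketch. It is not the AIMIP evaluation code, which this page does not show. It assumes fields are stored as nested lists on a regular latitude-longitude grid, uses cosine-of-latitude area weighting, and estimates a trend as an ordinary least-squares slope. The function names and sample numbers are hypothetical.

```python
import math


def area_weighted_mean(field, lats_deg):
    """Cosine-latitude-weighted mean of a 2-D field (one row per latitude)."""
    num = 0.0
    den = 0.0
    for row, lat in zip(field, lats_deg):
        weight = math.cos(math.radians(lat))  # grid-cell area scales with cos(lat)
        num += weight * sum(row)
        den += weight * len(row)
    return num / den


def global_mean_bias(model, reference, lats_deg):
    """Area-weighted mean of (model - reference) over the grid."""
    diff = [
        [m - r for m, r in zip(model_row, ref_row)]
        for model_row, ref_row in zip(model, reference)
    ]
    return area_weighted_mean(diff, lats_deg)


def linear_trend(years, values):
    """Ordinary least-squares slope, in units of `values` per year."""
    n = len(years)
    year_mean = sum(years) / n
    value_mean = sum(values) / n
    cov = sum((y - year_mean) * (v - value_mean) for y, v in zip(years, values))
    var = sum((y - year_mean) ** 2 for y in years)
    return cov / var


# Hypothetical toy data: a 3 x 2 grid where the model is uniformly 1 K too warm.
lats = [-45.0, 0.0, 45.0]
reference = [[280.0, 281.0], [300.0, 299.0], [279.0, 282.0]]
model = [[value + 1.0 for value in row] for row in reference]
bias = global_mean_bias(model, reference, lats)

# Hypothetical annual global-mean anomalies rising by 0.02 K per year.
trend = linear_trend([1979, 1980, 1981], [0.0, 0.02, 0.04])
```

In this toy case the bias comes out at about 1 K and the trend at about 0.02 K per year. A real evaluation would also need to handle time averaging, vertical levels, and the reference dataset's own uncertainty.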

Received: 19 May 2026 – Discussion started: 01 Jun 2026

Status: open (until 07 Aug 2026)

CEC1: 'Comment on egusphere-2026-2709 - No compliance with the policy of the journal', Juan Antonio Añel, 26 Jun 2026

Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
In your "Code and Data Availability" section you fail to provide repositories that we can accept for the code of all the models used in your work. You refer to Table 1, and there only a few links to GitHub and noaa.gov sites are provided, which are not valid. For example, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo.
In addition, you have archived the data used and produced in your work in several sites that do not fulfil GMD’s requirements for a persistent data archive because:
- They do not appear to have a published policy for data preservation over many years or decades (some flexibility exists over the precise length of preservation, but the policy must exist).

- They do not appear to have a published mechanism for preventing authors from unilaterally removing material. Archives must have a policy which makes removal of materials only possible in exceptional circumstances and subject to an independent curatorial decision.

- They do not appear to issue a persistent identifier such as a DOI or Handle for each precise dataset.
If we have missed a published policy which does in fact address this matter satisfactorily, please post a response linking to it. If you have any questions about this issue, please post them in a reply.
The GMD review and publication process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends, and on ensuring the provenance and replicability of the published papers for years after their publication. As your manuscript does not comply with this, it should not have been accepted for Discussions.
Please, therefore, publish your code and data in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
Later, if the Topical Editor decides to continue with the review or publication process of your manuscript and you are requested to upload a new version of it, then the 'Code and Data Availability' section of your manuscript must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel

Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2026-2709-CEC1
- AC1: 'Reply on CEC1', Brian Henn, 15 Jul 2026
  
  Dear editor:
  Thank you for your comment. We have addressed the concerns in a new version of the manuscript that has now been posted to arXiv. All of the models' code repositories are now available via DOIs/Zenodo. All of the data required to reproduce the results in the manuscript is now similarly backed by a DOI/Zenodo entry. We are confident that with these additions the manuscript now complies with GMD's code and data policies and can proceed to review/discussion. If there is any disagreement in this regard please let us know as soon as possible.
  Thank you,
  Brian Henn
  
  Citation: https://doi.org/10.5194/egusphere-2026-2709-AC1
  - CEC2: 'Reply on AC1', Juan Antonio Añel, 16 Jul 2026
    
    Dear authors,
    Thanks for your reply. Unfortunately, it does not solve the outstanding issues pointed out in my previous comment, and we cannot consider that the new version complies with the policy of the journal.
    First, the Code and data availability section still does not comply with the policy of the journal: it points to a table in the text and to sites that, as we already mentioned, do not comply, such as pageo-data, which is a git site under cloud storage operated by a private company. Also, the link that you provide there is generic and does not allow identification of the specific data files used in your work. Moreover, for several pieces of code you cite previously published papers instead of repositories, and those papers do not cite suitable repositories but sites that are not acceptable according to the policy.
    Therefore, we must insist that you rewrite the Code and data availability section of your manuscript with a list of repositories that are acceptable according to the policy of the journal and that contain all the code and data used in your work.
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2026-2709-CEC2

Viewed

Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.

Total article views: 131 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
131	0	0	131	0	0

Views and downloads (calculated since 01 Jun 2026)

Month	HTML	PDF	XML	Total
Jun 2026	73	0	0	73
Jul 2026	58	0	0	58

Latest update: 16 Jul 2026

Short summary

AIMIP (AI Model Intercomparison Project) is a community effort to rigorously evaluate AI weather and climate models, which simulate Earth's climate with extraordinary efficiency compared to traditional systems. Phase 1 is an atmosphere-only standardized experiment, showing that AI models are competitive on average historical patterns but may struggle with long-term warming trends and generalizing to unseen scenarios. The AIMIP Phase 1 dataset is publicly available for open model evaluation.

