This work is distributed under the Creative Commons Attribution 4.0 License.
Machine Learning for numerical weather and climate modelling: a review
Catherine Odelia de Burgh-Day
Tennessee Leeuwenburg
Abstract. Machine learning (ML) is increasing in popularity in the field of weather and climate modelling. Applications range from improved solvers and preconditioners, to parametrisation scheme emulation and replacement, and recently even to full ML-based weather and climate prediction models. While ML has been used in this space for more than 25 years, it is only in the last 10 or so years that progress has accelerated to the point that ML applications are becoming competitive with numerical knowledge-based alternatives. In this review, we provide a roughly chronological summary of the application of ML to aspects of weather and climate modelling from early publications through to the latest progress at the time of writing. We also provide an overview of key ML concepts and terms. Our aim is to provide a primer for researchers and model developers to rapidly familiarize and update themselves with the world of ML in the context of weather and climate models.
Status: open (until 12 Jun 2023)
RC1: 'Comment on egusphere-2023-350', Anonymous Referee #1, 17 May 2023
General comments:
The authors review the application of Machine Learning (ML) techniques to weather and climate modelling with an emphasis on historical and current developments. A glossary of commonly used terms and basic introductions to some concepts are provided. An in-depth exchange of knowledge between the ML and geoscientific modelling communities could be immensely beneficial and reviews like this can be an important step to facilitate this exchange.
As far as I am able to judge, the authors do a great job at covering a wide range of relevant publications and explaining the questions tackled in many of these applications. In terms of presentation, the language is concise and the paper is enjoyable to read. Including tabular or schematic representations of ML concepts and/or the discussed applications could further improve the visual appeal of the paper. A stronger narrative thread linking the different subsections and applications would preempt the impression of reading through a long list of papers - although this may be unavoidable given the scope of the reviewed works.
My primary concern about the paper in its current form is its utility to aid researchers in the development of better geoscientific models. Due to the wide range of works that are being discussed, many concepts and models are only touched upon in brief, without further elaboration of the underlying principles and connections between different applications. References to methodological works that could support future model development are only sparsely included in the main text or the glossary. In my opinion, incorporating suitable methodological references into the glossary and introductory sections could greatly strengthen the paper!
Specific comments:
L20 - Isn’t there an ongoing research effort to extend numerical models to utilise GPU hardware?
L24 - What about improvements in subgrid parameterizations due to better process understanding?
L65/66 - Maybe include a reference? (e.g. McGovern et al 2019 [1])
L104 - Very debatable if this is a necessary requirement for e.g. a weather prediction model?
L116ff - A narrative thread linking these subsections would be much appreciated!
L128 - Could it be more useful to briefly discuss the utility of the individual references rather than providing a large list?
L136 - Debatable, as recent trends in ML point strongly in the opposite direction (i.e. larger homogenised models).
L145 - Debatable as emphasis shifts towards self-supervised learning and better training regimes rather than architectural developments!
L156 - It could be important to emphasise that NN are known to interpolate within the training envelope and may not generalise well outside it (in contrast to physical laws).
L163 - Sigmoid is a highly uncommon and suboptimal choice of activation function compared to ReLU! (ReLU is also missing from the glossary despite its ubiquity in modern models).
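For readers unfamiliar with the two activations being contrasted here, a minimal NumPy sketch (function names are illustrative):

```python
import numpy as np

def sigmoid(x):
    # Saturates for large |x|, so gradients vanish in deep stacks
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Piecewise linear; gradient is 1 for all positive inputs
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 3.0])
relu(x)     # -> array([0., 0., 3.])
sigmoid(x)  # values squashed into (0, 1)
```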
L179 - Why are Token-sequence and Transformer models listed separately? I don’t see the justification for this classification introduced as is.
L521 - ConvLSTM were introduced in 2015 also in the context of nowcasting, including a reference would be appropriate [2].
L537 - Why is Sonderby et al discussed if nowcasting is supposedly omitted (L515)? Why not Espeholt et al 2021? Why is this not discussed in the context of probabilistic models?
L560 - A background reference to GNN either here or in the glossary could be beneficial! (e.g. Battaglia et al 2018 [3])
L581 - I would strongly object to treating the models in this section (excluding Clare et al) as probabilistic in contrast to the ones in the previous section! These models are fundamentally deterministic, in contrast to e.g. generative models such as Ravuri et al or true probabilistic models like Sonderby et al. Discussing the different types of ensembling used in these models could be valuable on its own (also referring to Scher et al [4]).
L661 - A large part of the affordable training is the use of much lower resolution and not due to the architecture! (1.4deg vs 0.25)
L663 - Missing several extensions of WeatherBench (e.g. WeatherBenchProbability, RainBench) and ClimateBench.
L981 - The activation function is applied elementwise to the result of a matrix multiplication and does not incorporate multiplication or bias addition by definition.
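To make the distinction concrete, a toy sketch (shapes illustrative) that separates the affine step from the elementwise activation:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))  # weight matrix
b = rng.standard_normal(3)       # bias vector
x = rng.standard_normal(4)       # input

z = W @ x + b           # affine step: matrix multiplication plus bias addition
a = np.maximum(0.0, z)  # activation (here ReLU), applied elementwise to z only
```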
L995 - Calling it a “complex” mechanism is not necessary. Typical Attention simply computes a dot product between vectors.
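A minimal sketch of scaled dot-product attention, showing that the core operation is just dot products followed by a softmax (array sizes are illustrative):

```python
import numpy as np

def dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # dot products between queries and keys
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)  # softmax over the keys
    return w @ V                        # weighted sum of value vectors

Q = np.eye(2)                 # two queries, d_k = 2
K = np.eye(2)                 # two keys
V = np.array([[1.0, 2.0],
              [3.0, 4.0]])
out = dot_product_attention(Q, K, V)  # shape (2, 2)
```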
L1007 - Normalisation plays an essential role in modern NN and probably deserves its own glossary term. It does not need to be performed over the batch (e.g. LayerNorm normalises over features instead).
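As a concrete illustration, a minimal LayerNorm sketch that normalises each sample over its feature axis independently, using no batch statistics (simplified: the learnable gain and bias are omitted):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Statistics are computed per sample over the feature axis,
    # so the result is independent of the rest of the batch
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0],
              [10.0, 20.0, 30.0]])
y = layer_norm(x)  # each row now has ~zero mean and ~unit variance
```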
L1028 - “Convolutional … sliding window” seems redundant.
L1037 - Not true in general! If the network is too thin it becomes highly unstable.
Technical corrections:
DNN and NN are introduced as separate abbreviations, but the distinction is neither applied consistently nor does it appear beneficial.
References:
[1] McGovern, A., Lagerquist, R., Gagne, D. J., Jergensen, G. E., Elmore, K. L., Homeyer, C. R., and Smith, T. (2019). Making the Black Box More Transparent: Understanding the Physical Implications of Machine Learning. Bull. Amer. Meteor. Soc., 100, 2175–2199. https://doi.org/10.1175/BAMS-D-18-0195.1
[2] Shi, X., et al. (2015). Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. Advances in Neural Information Processing Systems, 28.
[3] Battaglia, P. W., et al. (2018). Relational Inductive Biases, Deep Learning, and Graph Networks. arXiv preprint arXiv:1806.01261.
[4] Scher, S., and Messori, G. (2021). Ensemble Methods for Neural Network-Based Weather Forecasts. Journal of Advances in Modeling Earth Systems, 13(2).
Citation: https://doi.org/10.5194/egusphere-2023-350-RC1
Viewed
- HTML: 473
- PDF: 283
- XML: 10
- Total: 766
- BibTeX: 4
- EndNote: 5