This work is distributed under the Creative Commons Attribution 4.0 License.
TorchClim v1.0: A deep-learning framework for climate model physics
David Fuchs
Steven C. Sherwood
Abhnil Prasad
Kirill Trapeznikov
Jim Gimlett
Abstract. Climate models are hindered by the need to conceptualize and then parameterize complex physical processes that are not explicitly numerically resolved and for which no rigorous theory exists. Machine learning and artificial intelligence methods (ML/AI) offer a promising paradigm that can augment or replace the traditional parametrized approach with models trained on empirical process data. We offer a flexible and efficient framework, TorchClim, for inserting ML/AI physics surrogates that respect the parallelization of the climate model. A reference implementation of this approach is presented for the Community Earth System Model (CESM), where we replace the moist-physics and radiation parametrizations of the Community Atmosphere Model (CAM) with an ML/AI model. We show that a deep neural network surrogate trained on data from CAM itself can produce a stable model that reproduces the climate and variability of the original model, albeit with some biases. This framework is offered to the research community as an open-source project. The new framework seamlessly integrates into CAM's workflow and code-base and runs with negligible added computational cost, allowing rapid testing of various ML physics surrogates. The efficiency and flexibility of this framework open up new possibilities for using physics surrogates trained on offline data to improve climate model performance and better understand model physical processes.
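As a rough illustration of the workflow the abstract describes (train a surrogate offline on input/output profiles saved from the traditional parametrization, then export it so the GCM-side plugin can load it), a minimal PyTorch sketch might look as follows. The layer sizes, variable names, and file name are assumptions for illustration, not the actual TorchClim interface.

```python
# Hypothetical sketch: fit a DNN emulator of a GCM physics parametrization on
# saved (column state -> tendencies/fluxes) pairs, then export it to TorchScript
# so a LibTorch-based plugin inside the GCM can load it. Sizes are illustrative.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

N_IN, N_OUT = 108, 112  # flattened per-column input / output sizes (assumed)

class Surrogate(nn.Module):
    def __init__(self, hidden=512, depth=6):
        super().__init__()
        layers, n = [], N_IN
        for _ in range(depth):
            layers += [nn.Linear(n, hidden), nn.ReLU()]
            n = hidden
        layers.append(nn.Linear(n, N_OUT))
        self.net = nn.Sequential(*layers)

    def forward(self, x):          # x: (n_columns, N_IN)
        return self.net(x)         # -> (n_columns, N_OUT)

model = Surrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
loss_fn = nn.MSELoss()

# Stand-ins for input/output profiles saved from the traditional parametrization.
x = torch.randn(1024, N_IN)
y = torch.randn(1024, N_OUT)
loader = DataLoader(TensorDataset(x, y), batch_size=128, shuffle=True)

for epoch in range(5):
    for xb, yb in loader:
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()

# Export as a TorchScript archive that the C++/Fortran side could load via LibTorch.
torch.jit.script(model).save("surrogate.pt")
```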
Status: open (until 08 Dec 2023)
RC1: 'Comment on egusphere-2023-1954', Anonymous Referee #1, 14 Nov 2023
The manuscript "TorchClim v1.0: A deep-learning framework for climate model physics" by David Fuchs, Steven C. Sherwood, Abhnil Prasad, Kirill Trapeznikov, and Jim Gimlett describes an application of a machine learning (ML) approach within the CAM AGCM. The aim of this study is to introduce an effective framework that can be adapted to implement ML approaches in climate model development. The study has a technical aspect and a climate model related aspect. My background is in climate dynamics and modelling. I cannot really comment on the computer technology aspects. The study addresses an important aspect in climate model development: how to include powerful ML approaches into model development. The manuscript should be considered for publication, but there are a number of aspects the authors should consider before publication.
Main comments:
(*) Outcome: To summarise the manuscript: The study implements the TorchClim code of ML to improve one specific parameterisation in an AGCM. It is designed to be "plug-and-play", but to do so the authors actually have to heavily optimize the ML algorithm for the specific problem/model, which does not appear to be "plug-and-play" at all. As an outcome, the model is not as good at representing the climate state, and it is numerically slower than the original model. Given this outcome, one wonders: why would anyone want to use TorchClim?
The authors seem to be presenting a failed approach. While this is still helpful for the community, it is unlikely to motivate readers to use this approach. The authors need to think about how this study can be made more helpful for the reader.
(*) Technical aspects: The manuscript is primarily written for a computer science audience, and not as interesting for climate model researchers. Given that the journal is read mostly by climate researchers, I would recommend strengthening the climate modelling aspects and reducing the technical aspects, as they appear less important. However, I may have missed the importance of the technical aspects. One way to do this may be to put the simulation results first and explain some of the technical aspects later or in an appendix.
(*) Insufficient results section (section 4): the discussion of the application results (figs. 6-9) is too short. It should be explained what experiments have been done, how long they have been running, and what time interval after initialisation has been analysed. An important aspect of an ML parameterisation is stability. So, running longer (several-year) simulations without running into numerical instabilities or unphysical states would be important to illustrate.
Other comments (in order as they appear in the text):
------------
line 14 "global circulation and climate models (GCMs)": GCM = General Circulation Model
------------
line 26 " For example, Fuchs et al. (2023) showed ...": A self-citation is not a convincing argument. Can the authors find independent literature support?
------------
line 31-32 "Yet in some cases, the sheer number of parametrizations and versions of parametrizations in current GCMs could point to a fundamental problem with TP": Unclear what the authors want to say here. Why/How would "the sheer number of parametrizations" point to a fundamental problem with TP?
------------
line 46 "the curse of dimensionality": What is this?
------------
line 46-47 "... the increase in the computational complexity of GCMs was met by ... . The increase in computational complexity of GCMs was met by ...": Repetition.
------------
line 57 "One way of doing this is to use observations to tune existing or new parametrizations": One would hope that observations are always part of parametrizations. Not sure what the authors want to point out here.
------------
Fig. 1: The logic and meaning of this diagram is unclear. It is also unclear why a climate modeller would want or need to care about this.
------------
CPUs vs. GPUs: Why does a climate researcher need to care about these technical aspects? If this is mentioned in the manuscript, it needs a bit more explanation for a non-computer-science audience.
------------
section 2.2 and Fig. 2: It may help to start with the physics part and then go to the LibTorch block. In the end, the authors write for climate researchers, not for computer scientists. Or maybe both, but it would still help the climate researchers to get into this.
------------
section 2.3 "Parallelization framework": Why is this important? CPU speed? Naively, a climate modeller may think the sole purpose of the ML approach is to compute the parameters of a complex function, and then the complex function replaces the TP equation. Then why would I need this whole infrastructure here? It seems to be overkill.
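To make my naive picture concrete, a conceptual sketch of a per-chunk surrogate call might look like the following (Python with hypothetical names, not the actual TorchClim or CAM code). Each rank would call the loaded surrogate independently on its own chunk of columns; if the infrastructure mostly amounts to this packing and inference step, the manuscript should say so, and if it does substantially more, that should be explained for the climate modelling reader.

```python
# Conceptual sketch only: the GCM's parallel decomposition hands the physics a
# "chunk" of columns on each rank, so the surrogate call has to pack that chunk
# into a tensor, run inference, and hand the outputs back. File and variable
# names are hypothetical.
import torch

surrogate = torch.jit.load("surrogate.pt")   # TorchScript module exported offline
surrogate.eval()

def physics_chunk(state_chunk: torch.Tensor) -> torch.Tensor:
    """state_chunk: (ncols, nlev, nvars_in), one rank's chunk of columns."""
    ncols = state_chunk.shape[0]
    x = state_chunk.reshape(ncols, -1)        # pack each column profile into one row
    with torch.no_grad():
        y = surrogate(x)                      # (ncols, n_outputs)
    return y                                  # caller scatters back into tendencies/fluxes

# Each rank calls physics_chunk on its own chunk, so the surrogate inherits the
# host model's existing MPI/OpenMP decomposition unchanged.
```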
------------
line 207 "... three categories": What is the important difference between categories 2 and 3? They seem to be similar. Whether it is a dynamical or other kind of model does not appear to make a difference.
------------
Table 1 and other sections: Lots of variable acronyms are presented but not explained (e.g., what is FLNS?).
------------
line 229-230 "We note that a good starting point for training a surrogate model ... is simply the TP": Really? Evidence for this? It could limit the design of the new model to be arbitrarily similar to a bad TP model.
------------
line 232-233 "... the first is inserted in the TP before ... the second is inserted after that point.": Why? What is the purpose?
------------
Fig. 5: Not clear what this diagram is explaining.
------------
line 239 "these data": Which data? Unclear.
------------
line 252 "45e+6": should be 4.5 × 10^7; "e+6" reads like computer code.
------------
Section 4.1, DNN model development: The description here sounds highly specific to the particular problem; with statements like "tested dozens of versions" one gets the impression that this is not "plug-and-play". As the aim of the study seems to be to introduce a general approach, it would be helpful to see how this experience can be used to explain a general application. It does not seem straightforward.
------------
line 265 "MSE": not explained.
------------
explanation of loss function, eq. in line 270: The loss function is not well explained. A number of terms play into this, and later in the text more terms are mentioned, and it is unclear how they relate to the equation in line 270.
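To make the question concrete: the kind of composite objective I assume is meant would combine a data-misfit term, a regularization term, and additional penalty terms, roughly as in the sketch below (my reading only; the weights and the extra penalty are placeholders, not the authors' actual equation in line 270).

```python
# Illustrative sketch of one common way such a composite loss is assembled in
# PyTorch. The coefficients and the physics penalty are assumed placeholders.
import torch
import torch.nn as nn

mse = nn.MSELoss()

def total_loss(model, x, y_true, lambda_l2=1e-5, lambda_phys=1e-2):
    y_pred = model(x)
    data_term = mse(y_pred, y_true)                              # MSE misfit term
    l2_term = sum((p ** 2).sum() for p in model.parameters())   # L2 regularization
    # Hypothetical physics-motivated penalty: discourage negative predictions
    # for a field that must be non-negative (e.g. precipitation).
    phys_term = torch.relu(-y_pred[:, 0]).mean()
    return data_term + lambda_l2 * l2_term + lambda_phys * phys_term
```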
------------
line 266 "L2 regularization": What is this?
------------
line 296 "loss term": How does this relate to the eq. in line 270?
------------
line 301 "... examples of learning biasing Karniadakis et al. (2021)": unclear. Maybe: "examples of learning biasing following Karniadakis et al. (2021)"?
---------- end -----------
Citation: https://doi.org/10.5194/egusphere-2023-1954-RC1
Model code and software
GitHub repository, David Fuchs: https://github.com/dudek313/torchclim