Can AI be enabled to dynamical downscaling? A Latent Diffusion Model to mimic km-scale COSMO5.0_CLM9 simulations
Abstract. Downscaling techniques are one of the most prominent applications of Deep Learning (DL) in Earth System Modeling. A robust DL downscaling model can generate high-resolution fields from coarse-scale numerical model simulations, avoiding the time-consuming and resource-intensive application of regional/local models. Additionally, generative DL models have the potential to provide uncertainty information by generating ensemble-like scenario pools, a task that is computationally prohibitive for traditional numerical simulations. In this study, we apply a Latent Diffusion Model (LDM) to downscale ERA5 data over Italy up to a resolution of 2 km. The high-resolution target data consist of 2-m temperature and 10-m horizontal wind components from a dynamical downscaling performed with COSMO_CLM. Our goal is to demonstrate that recent advancements in generative modeling enable DL to deliver results comparable to those of numerical dynamical models, given the same input data, preserving the realism of fine-scale features and flow characteristics. A selection of predictors from ERA5 is used as input to the LDM, and a residual approach against a reference UNET is leveraged in applying the LDM. The performance of the generative LDM is compared with reference baselines of increasing complexity: quadratic interpolation of ERA5, a UNET, and a Generative Adversarial Network (GAN) built on the same reference UNET. Results highlight the improvements introduced by the LDM architecture and the residual approach over these baselines. The models are evaluated on a yearly test dataset, and their performance is assessed through deterministic metrics, spatial distribution of errors, and reconstruction of frequency and power spectra distributions.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2646', Anonymous Referee #1, 08 Oct 2024
General Comments
This study explores the use of Latent Diffusion Models (LDM_res) for downscaling low-resolution meteorological data, comparing it with UNET and GAN models. The authors use ERA5 reanalysis data as input and COSMO_CLM high-resolution data as reference truth. The LDM_res model incorporates a residual approach, allowing it to focus on small-scale features while using a simpler model for large-scale patterns. The results show that LDM_res outperforms the other models in reconstructing fine-scale details, spatial distributions, and frequency distributions, particularly for wind speed and temperature. A key innovation lies in the model's use of diffusion processes combined with the residual framework, which enhances accuracy while maintaining computational efficiency compared to traditional numerical models like COSMO_CLM. However, some questions remain regarding the experimental design and the analysis of results. Further analysis and clarification are required before publication.
Specific Comments
- The objective of applying LDM for dynamical downscaling is timely and relevant, but it should be made clearer to a general audience. Explicitly outline what key challenges the LDM addresses that previous models have struggled with. Highlight the main contributions more prominently in the abstract and methods sections. Emphasize the specific advancements over existing downscaling techniques, particularly the practical significance of LDM compared to UNET and GAN approaches.
- The paper provides a detailed account of the architectures, but more information is needed on hyperparameter selection. Clarify how hyperparameters were optimized for each model to ensure a fair comparison. For instance, how is the timestep embedding in Figure 4 implemented? The residual approach is a strong point, but a deeper explanation of why the residual method was adopted would strengthen the paper. What specific insights led to using a residual method, and how does it enhance the model's training stability?
- The comparisons with UNET, GAN, and interpolation are useful, but the study lacks a broader context regarding recent advancements. It would improve the paper if comparisons with other state-of-the-art downscaling methods were included, such as recent transformer-based approaches or hybrid models that combine CNNs with probabilistic methods.
- The study used 21 years of data, but the explanation regarding "70% for training, 15% for validation, and 5% for testing (corresponding to 15, 3, and 1 year, respectively)" is unclear. Is the one year of test data randomly selected from a specific year, or does it include a few days from each season across multiple years? If only data from a particular year was used, it may not account for interannual variations. Additionally, have factors such as La Niña effects been considered in analyzing the temperature and wind fields?
- The evaluation is thorough, but it could be strengthened by including (i) a case study of an extreme weather event to showcase the model's performance in challenging scenarios, and (ii) a discussion of the model's performance across different seasons within the study area to illustrate robustness in various conditions.
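As background to the timestep-embedding question raised in the specific comments above: denoising networks in diffusion models commonly receive the diffusion step through a sinusoidal embedding. The sketch below shows that common form in NumPy. It is an assumption for illustration only, not a description of what the manuscript's Figure 4 implements, and the function name and the parameters `dim` and `max_period` are hypothetical.

```python
import numpy as np


def sinusoidal_timestep_embedding(timesteps, dim, max_period=10000.0):
    """Map integer diffusion steps to vectors of length `dim`.

    Hypothetical helper: a common sinusoidal scheme, not the
    manuscript's confirmed implementation.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    # Column vector of timesteps, shape (n_steps, 1)
    t = np.asarray(timesteps, dtype=np.float64).reshape(-1, 1)
    half = dim // 2
    # Geometrically spaced frequencies from 1 down towards 1 / max_period
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    angles = t * freqs[None, :]
    # First half of the vector holds sines, second half cosines
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


embedding = sinusoidal_timestep_embedding([0, 10, 999], dim=8)
```

In a typical denoiser, such a vector is passed through a small MLP and added to, or used to scale, the feature maps of each block; whether LDM_res does this is exactly what the comment asks the authors to clarify.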
Minor Comments
- In Figure 6, the second column zooms in on Sardinia Island. Consider resizing this image to match the first column, as this would make the figure more aesthetically pleasing.
- The future work section is informative. It would be beneficial to add some thoughts on how this approach could be integrated into existing modeling frameworks or operational systems, which could help bridge the gap between research and practical applications.
- Wind speed downscaling is evidently more challenging than temperature, and while some explanations are given, providing references would lend more credibility to the discussion. Additionally, consider exploring whether other variables, such as boundary layer height or atmospheric vorticity, could be used to improve downscaling performance for wind speed.
Citation: https://doi.org/10.5194/egusphere-2024-2646-RC1
AC1: 'Reply on RC1', Elena Tomasi, 13 Nov 2024
Dear Anonymous Referee, thank you for your comments, suggestions, and participation in the open discussion. We are actively working on the revisions and will upload our replies to your comments as soon as possible, within the next 4 weeks as expected by the GMD review process. Kind regards, Elena
Citation: https://doi.org/10.5194/egusphere-2024-2646-AC1
RC2: 'Comment on egusphere-2024-2646', Michael Langguth, 07 Nov 2024
The present study ‘Can AI be enabled to dynamical downscaling? A Latent Diffusion Model to mimic km-scale COSMO5.0_CLM9 simulations’ explores the application of advanced statistical downscaling techniques with deep neural networks to emulate a high-resolution reanalysis product. In particular, the authors adapt a Latent Diffusion Model (LDM), originally proposed by Leinonen et al. (2023) [1] for precipitation nowcasting, to downscale ERA5 reanalysis data, generating 2m temperature and 10m-wind fields at 2 km resolution. The target data is derived from the Italian Very High-Resolution Reanalysis produced with COSMO_CLM9 (VHR-REA CCLM).
The performance of their LDM is compared against two baseline models, a standard U-Net and a Patch Generative Adversarial Network (GAN). Results reveal that the LDM, when combined with the two-step corrector method from Mardani et al. (2024) [2], outperforms these baselines in capturing fine-scale spatial features and in reproducing the frequency distribution of the high resolved target reanalysis product. Furthermore, the LDM framework is argued to be more efficient than diffusion models applied to raw data in physical space, while alleviating issues with GAN-based downscaling models, such as training instability and mode collapse.
The authors present the motivation for their model architecture, the experimental design and a comprehensive evaluation in a clear and well-structured way. The study adheres to high scientific standards and offers valuable insights into the efficient application of diffusion models for statistical downscaling of meteorological data. Nevertheless, minor revisions are recommended before the manuscript is published in GMD.
These comprise:
- The study’s title and the introduction question whether deep neural networks can efficiently emulate dynamical downscaling. However, the authors employ an image-to-image approach with separate models for the 2m temperature and the 10m-wind. This approach challenges temporal and inter-variable consistency, an aspect inherently maintained with dynamical downscaling using physics-based numerical models. The authors should address this aspect in the Introduction and discuss future prospects in their Conclusion to align with the scope of the paper. Referring to the recent study by Mardani et al. (2024) [2], which presents promising results for multi-variate downscaling, would strengthen the discussion.
- In Section 2, the authors mention that the LDM can be applied to larger domains during inference compared to the training step. While this flexibility is appealing, spatial dimensionality constraints on the input data are common for U-Nets due to downsampling operations. For example, three average pooling operations with a factor of 2 would require the input dimensions of the U-Net to be multiples of 8. Do these constraints also apply to the Denoiser network in the LDM? Additionally, Figure 5 indicates that the input data comprises 512x512 grid points during inference. Isn’t the model evaluated on the larger target domain with 576x672 grid points as mentioned in the text? A brief assessment of any performance degradation, such as in RMSE, when applied to the full target domain would be insightful.
- The spatial resolution of the ERA5 input data is increased to 16 km for data alignment. How does this impact the input data quality (e.g. potential aliasing effects)? Can this affect or degrade the downscaling process? Please provide a brief explanation on the necessity for this adjustment.
- Section 4 describes the tested deep neural network architectures. However, the authors should provide more details on the training process, i.e. learning rate schedule, configuration of the optimizer, number of training epochs and other relevant hyperparameters. The details could be included in the Appendix and would help to ensure reproducibility of the findings.
- Section 5.4 provides valuable insight into the reconstruction of the frequency distribution of the target data. However, the results are rather qualitative. I suggest quantifying the difference, in particular with the Integrated Quadratic Distance, a proper divergence score on the underlying cumulative distribution function (CDF) of the data, see Thorarinsdottir et al. (2013) [3].
- Similarly, the Power Spectra Analysis can be further quantified by calculating the Radially Averaged Log Spectral Distance (RALSD)-score, see Eq. 8 in Harris et al. (2022) [4].
- The effect of the LDM approach on the fine-scale variability via the Variational Autoencoder to compress and decompress the high-resolution data is very interesting and relevant for the chosen framework. Also consider referring to the effective resolution of numerical model data, see e.g. Skamarock et al. (2014) [5]. It is suggested to add this analysis to the main text. To compensate for the increased length of the text, parts of the model architecture description may be moved to the Appendix (together with the details on the training and hyperparameters mentioned above).
- The Section on Future Work is very short, and more details on the various prospects mentioned (temporal and inter-variable consistency, precipitation downscaling, ensemble generation) should be presented.
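The divisibility constraint raised in the comment on Section 2 can be illustrated with a short sketch. Assuming a network with three factor-2 downsampling steps, as in the referee's example, each spatial dimension must be a multiple of 2^3 = 8; a field that does not satisfy this can be padded before inference and cropped afterwards. The helper name `pad_to_multiple` and the choice of reflect padding are assumptions for illustration, not the authors' procedure.

```python
import numpy as np


def pad_to_multiple(field, n_downsamplings=3, factor=2):
    """Pad the last two axes so each is a multiple of factor**n_downsamplings.

    Hypothetical helper; returns the padded array and the original
    (ny, nx) so the network output can be cropped back afterwards.
    """
    multiple = factor ** n_downsamplings
    ny, nx = field.shape[-2:]
    # Number of extra rows/columns needed to reach the next multiple
    pad_y = (-ny) % multiple
    pad_x = (-nx) % multiple
    # Leave any leading (channel/batch) axes untouched
    pad_width = [(0, 0)] * (field.ndim - 2) + [(0, pad_y), (0, pad_x)]
    padded = np.pad(field, pad_width, mode="reflect")
    return padded, (ny, nx)


# A field slightly smaller than the 576x672 target domain needs padding
padded, original_shape = pad_to_multiple(np.zeros((2, 575, 670)))

# The domain sizes quoted in the comment are already multiples of 8
full_domain, _ = pad_to_multiple(np.zeros((576, 672)))
```

Both sizes mentioned in the comment, 512x512 and 576x672, are multiples of 8, so under this assumption neither would need padding; a deeper network with more downsampling steps would impose a stricter multiple.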
References:
[1] Leinonen, Jussi, et al. “Latent diffusion models for generative precipitation nowcasting with accurate uncertainty quantification.” arXiv preprint arXiv:2304.12891 (2023).
[2] Mardani, Morteza, et al. “Residual Diffusion Modeling for Km-scale Atmospheric Downscaling.” (2024).
[3] Thorarinsdottir, Thordis L., Tilmann Gneiting, and Nadine Gissibl. “Using proper divergence functions to evaluate climate models.” SIAM/ASA Journal on Uncertainty Quantification 1.1 (2013): 522-534.
[4] Harris, Lucy, et al. “A generative deep learning approach to stochastic downscaling of precipitation forecasts.” Journal of Advances in Modeling Earth Systems 14.10 (2022): e2022MS003120.
[5] Skamarock, William C., et al. “Atmospheric kinetic energy spectra from global high-resolution nonhydrostatic simulations.” Journal of the Atmospheric Sciences 71.11 (2014): 4369-4381.
Citation: https://doi.org/10.5194/egusphere-2024-2646-RC2
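The two scores suggested in this review, the Integrated Quadratic Distance of [3] and the RALSD of [4], can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the IQD is approximated as the integral of the squared difference between two empirical CDFs on a regular grid, and the RALSD is computed for a single pair of square fields rather than averaged over a batch. All function names and the `n_grid` parameter are hypothetical; the cited papers remain the reference for the exact definitions.

```python
import numpy as np


def integrated_quadratic_distance(sample_a, sample_b, n_grid=2001):
    """Approximate IQD = integral of (F_a(x) - F_b(x))^2 dx from two samples."""
    a = np.sort(np.asarray(sample_a, dtype=float).ravel())
    b = np.sort(np.asarray(sample_b, dtype=float).ravel())
    x = np.linspace(min(a[0], b[0]), max(a[-1], b[-1]), n_grid)
    # Empirical CDFs evaluated on the common grid
    cdf_a = np.searchsorted(a, x, side="right") / a.size
    cdf_b = np.searchsorted(b, x, side="right") / b.size
    sq_diff = (cdf_a - cdf_b) ** 2
    # Trapezoidal rule written out explicitly
    return float(np.sum(0.5 * (sq_diff[1:] + sq_diff[:-1]) * np.diff(x)))


def radially_averaged_power_spectrum(field):
    """1-D power spectrum of a square 2-D field, binned by integer wavenumber."""
    ny, nx = field.shape
    if ny != nx:
        raise ValueError("this sketch assumes a square field")
    power = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    # After fftshift the zero wavenumber sits at index n // 2
    k = np.arange(nx) - nx // 2
    radius = np.hypot(*np.meshgrid(k, k))
    bins = np.rint(radius).astype(int)
    n_bins = nx // 2
    mask = (bins >= 1) & (bins <= n_bins)  # drop the mean and the corners
    sums = np.bincount(bins[mask], weights=power[mask], minlength=n_bins + 1)
    counts = np.bincount(bins[mask], minlength=n_bins + 1)
    return sums[1:] / counts[1:]


def ralsd(truth, generated):
    """Root-mean-square difference of the two log spectra, in decibels."""
    p_truth = radially_averaged_power_spectrum(truth)
    p_gen = radially_averaged_power_spectrum(generated)
    return float(np.sqrt(np.mean((10.0 * np.log10(p_truth / p_gen)) ** 2)))


rng = np.random.default_rng(0)
reference = rng.standard_normal((32, 32))
iqd_same = integrated_quadratic_distance(reference, reference)
iqd_shifted = integrated_quadratic_distance([0.0], [1.0])
ralsd_same = ralsd(reference, reference)
ralsd_scaled = ralsd(reference, 10.0 * reference)
```

Both scores are zero for identical inputs and grow with the mismatch. In the toy usage, two point masses one unit apart give an IQD close to 1, and multiplying a field by 10 raises its power by a factor of 100, which corresponds to an RALSD of about 20 dB.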
AC2: 'Reply on RC2', Elena Tomasi, 13 Nov 2024
Dear Referee, thank you for your comments, suggestions, and participation in the open discussion. We are actively working on the revisions and will upload our replies to your comments as soon as possible, within the next 4 weeks as expected by the GMD review process. Kind regards, Elena
Citation: https://doi.org/10.5194/egusphere-2024-2646-AC2
Data sets
Sample dataset for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12934521
2000–2002 Dataset [1/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12944960
2003–2005 Dataset [2/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12945014
2006–2008 Dataset [3/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12945028
2009–2011 Dataset [4/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12945040
2012–2014 Dataset [5/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12945050
2015–2017 Dataset [6/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12945058
2018–2020 Dataset [7/7] for the models trained and tested in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12945066
Pretrained models presented in the paper 'Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy' Elena Tomasi, Gabriele Franch, and Marco Cristoforetti https://doi.org/10.5281/zenodo.12941117
Model code and software
LDM_res v1.0 Gabriele Franch, Elena Tomasi, and Marco Cristoforetti https://doi.org/10.5281/zenodo.13356322
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
- HTML: 122
- PDF: 0
- XML: 1
- Total: 123
- BibTeX: 0
- EndNote: 0