the Creative Commons Attribution 4.0 License.
Machine Learning Parameterization of the Multi-scale Kain-Fritsch (MSKF) Convection Scheme and stable simulation coupled in WRF using WRF-ML v1.0
Abstract. Warm-sector heavy rainfall often occurs along the coast of South China, and it is usually localized and long-lasting, making it challenging to predict. High-resolution numerical weather prediction (NWP) models are increasingly used to better resolve topographic features and forecast such high-impact weather events. However, when the grid spacing becomes comparable to the length scales of convection, known as the gray zone, the turbulent eddies in the atmospheric boundary layer are only partially resolved and must still be partially parameterized. Whether to use a convection parameterization (CP) scheme in the gray zone remains controversial. Scale-aware CP schemes have been developed to improve the representation of convective transport within the gray zone. The multi-scale Kain-Fritsch (MSKF) scheme includes modifications that allow it to be used effectively at grid spacings as fine as 2 km. In recent years, machine learning (ML) models have been increasingly applied across the atmospheric sciences, including as replacements for physical parameterizations. This work proposes a multi-output bidirectional long short-term memory (Bi-LSTM) model as a replacement for the scale-aware MSKF CP scheme. The Weather Research and Forecasting (WRF) model is used to generate training and testing data over South China at a horizontal resolution of 5 km. Furthermore, the WRF model is coupled with the ML-based CP scheme and compared with WRF simulations that use the original MSKF scheme. The results demonstrate that the Bi-LSTM model can achieve high accuracy, indicating the potential of ML models to substitute for the MSKF scheme in the gray zone.
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-1967', Anonymous Referee #1, 05 Dec 2023
This manuscript develops and evaluates an ML-based surrogate for the MSKF convection scheme in WRF. The text is well written and the approach seems novel. I am not an expert on ML, so my comments are all on the atmospheric modeling/parameterization side of the work. My major comments are as follows:
- Why is the goal to emulate what a conventional convection parameterization does in the first place? I ask because the ML scheme presented in the paper seems designed and trained to emulate MSKF performance at 5-km resolution. Why not just run MSKF directly? What's the need to run this ML-based emulator? Is it cheaper? Is it better? Please elaborate a little more on that.
- Is the performance of MSKF good for the area and cases the authors are interested in? If so, the authors should provide good evidence for it.
Minor comments (mainly on the introduction, which is otherwise quite well written) follow:
- Line 37: "Nevertheless ... These conflicting findings typically...". Did the Schwartz paper also use CP in their simulations or not? Either way, I don't quite see the conflicting part here.
- Line 102: "Furthermore, all previous studies have predominantly focused on using CP schemes in GCM models for climate forecasting. Moreover, the choice of CP schemes significantly influences the uncertainty in precipitation forecasts within weather forecasting models. The complexity of the CP schemes also surpasses those applied in climate models (Arakawa, 2004)." I don't think the last statement is generally true. Also, the logic doesn't seem to flow among these few sentences.
Citation: https://doi.org/10.5194/egusphere-2023-1967-RC1
AC1: 'Reply on RC1', Xiaohui Zhong, 05 Feb 2024
RC2: 'Comment on egusphere-2023-1967', Anonymous Referee #2, 08 Jan 2024
This study presents an ML version of the multi-scale Kain-Fritsch convection parameterization and couples it with WRF. The results indicate that the ML parameterization yields outcomes comparable to the original WRF simulations. However, the motivation behind this study is unclear, and the evaluation of the ML parameterization lacks thoroughness. As a result, I recommend a major revision to address these issues.
Major comments:
- This study focuses on the development of a new machine learning (ML) convection parameterization. However, the training dataset used in this study is derived from the WRF simulation, which may not provide significant benefits for the ML parameterization. The resulting model essentially serves as a surrogate for the old parameterization. A more suitable training dataset could come from a super-parameterization or a cloud-resolving model, as it could help reduce uncertainties and enhance the ML parameterization's performance.
- It would be beneficial to conduct a comprehensive evaluation of the ML convection parameterization, considering factors beyond just the mean state. In addition to assessing the accuracy of tendencies, it is important to evaluate the parameterization's ability to accurately represent triggered convection. Furthermore, examining the diurnal cycle of precipitation and interpretability of the ML parameterization can provide valuable insights. Additionally, when coupling machine learning parameterization with dynamics, it is crucial to investigate potential issues with simulation stability. Have you tested the parameterization for longer simulations to assess its stability over extended time periods?
- In this study, the machine learning approach performs multi-task learning, simultaneously handling trigger function classification and tendencies regression. It would be valuable to conduct an ablation study, where the classification and regression tasks are separately evaluated and compared with the multi-task learning approach. This analysis can provide insights into the individual contributions and effectiveness of each task, helping to further understand the benefits and limitations of the multi-task learning approach.
- In Figure 4, is it correct that r^2 is the coefficient of determination? It seems unusual that although the points in (a) are widely scattered, the r^2 value is very high. Also, while the RMSE difference between (a) and (b) is large, their r^2 values are quite close.
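The question about Figure 4 can be illustrated with a small numerical sketch. The code below uses synthetic data only (not the manuscript's data) and hypothetical helper names. It shows two things under those assumptions: RMSE depends on the scale of the target variable while the coefficient of determination does not, so two panels can have very different RMSE and nearly identical r^2; and the squared Pearson correlation can stay close to 1 for biased predictions where the coefficient of determination falls sharply. It makes no claim about which definition the authors used.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def coefficient_of_determination(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot; penalizes bias as well as scatter."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def squared_pearson_r(y_true, y_pred):
    """Squared Pearson correlation; unchanged by linear rescaling of y_pred."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


# Synthetic "truth" and "prediction" pairs (assumption: illustrative only).
rng = random.Random(0)
truth_small = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
noise = [rng.gauss(0.0, 0.1) for _ in range(2000)]
pred_small = [t + e for t, e in zip(truth_small, noise)]

# Same data with every value multiplied by 10, e.g. a variable in other units.
truth_large = [10.0 * t for t in truth_small]
pred_large = [10.0 * p for p in pred_small]

# Perfectly correlated but biased prediction (wrong slope and offset).
pred_biased = [2.0 * t + 1.0 for t in truth_small]

print("RMSE small / large:", rmse(truth_small, pred_small), rmse(truth_large, pred_large))
print("R^2  small / large:",
      coefficient_of_determination(truth_small, pred_small),
      coefficient_of_determination(truth_large, pred_large))
print("biased: squared Pearson r =", squared_pearson_r(truth_small, pred_biased),
      " R^2 =", coefficient_of_determination(truth_small, pred_biased))
```

In this sketch the two scaled cases share the same R^2 while their RMSE differs by a factor of 10, which is one way the pattern described in the comment can arise. Reporting which definition of r^2 is used would remove the ambiguity.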
Minor comments:
- Line 145: Could you please explain the rationale for selecting these particular variables?
- Line 170: Could you explain the trigger condition based on lifting condensation level (LCL), convective available potential energy (CAPE), cloud top and base heights, and entrainment rates?
- Line 45: ‘hsave’ to ‘have’
- Caption in Figure 1: ‘5’ to ‘5°’
- Line 149: it could be better to add these 4 variables into Table 1.
- Line 182: remove ‘data’
Citation: https://doi.org/10.5194/egusphere-2023-1967-RC2
AC2: 'Reply on RC2', Xiaohui Zhong, 05 Feb 2024
Data sets
Machine Learning Parameterization of the Multi-scale Kain-Fritsch (MSKF) Convection Scheme and stable simulation coupled in WRF using WRF-ML v1.0
Xiaohui Zhong, Xing Yu, Hao Li
https://doi.org/10.5281/zenodo.10032404
Model code and software
Machine Learning Parameterization of the Multi-scale Kain-Fritsch (MSKF) Convection Scheme and stable simulation coupled in WRF using WRF-ML v1.0
Xiaohui Zhong, Xing Yu, Hao Li
https://doi.org/10.5281/zenodo.10032404
WRF Model Version 4.3
William C. Skamarock, Joseph B. Klemp, Jimy Dudhia, David O. Gill, Zhiquan Liu, Judith Berner, Wei Wang, J. G. Powers, M. G. Duda, D. M. Barker, and others
https://github.com/wrf-model/WRF
Viewed
- HTML: 318
- PDF: 129
- XML: 30
- Total: 477
- BibTeX: 21
- EndNote: 16
Xiaohui Zhong
Xing Yu
Hao Li