the Creative Commons Attribution 4.0 License.
Deep learning models for generation of precipitation maps based on Numerical Weather Prediction
Abstract. Numerical Weather Prediction (NWP) models are atmospheric simulations that imitate the dynamics of the atmosphere and provide high-quality forecasts. One of the most significant limitations of NWP is the large amount of computational resources it requires, which limits the spatial and temporal resolution of the outputs. Traditional meteorological techniques to increase the resolution are based solely on information from a limited group of variables of interest. In this study, we offer an alternative approach in which we use the complete set of NWP variables to generate high-resolution, short-time precipitation predictions in the form of precipitation maps. To achieve this, five different deep learning models were trained and evaluated: a baseline, U-Net, two deconvolutional networks, and one conditional generative model (CGAN). A total of 20 independent random initializations were performed for each of the models. The predictions were evaluated using MAE and LEPS-based skill scores, ETS, CSI, and frequency bias after applying several thresholds. The models showed a significant improvement in predicting precipitation, showing the benefits of including the complete information from the NWP. The algorithms increased the resolution of the predictions and corrected an over-forecast bias in the input information. However, some models introduced new types of bias: U-Net tended toward mid-range precipitation events, and the deconvolutional models favored low-rain events and generated some spatial smoothing. The CGAN offered the highest-quality precipitation forecasts, generating realistic outputs and indicating possible paths for future research.
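For readers unfamiliar with the threshold-based scores named in the abstract, the following is a minimal sketch of how CSI, ETS, and frequency bias are commonly computed from a 2x2 contingency table. It is not the authors' evaluation code: the function names, the use of NumPy, and the choice of ">=" for the threshold comparison are assumptions made for illustration only.

```python
import numpy as np


def contingency_counts(forecast, observed, threshold):
    """Count hits, false alarms, misses and correct negatives.

    An "event" is assumed here to be a value >= threshold (illustrative choice).
    """
    f = np.asarray(forecast) >= threshold
    o = np.asarray(observed) >= threshold
    hits = int(np.sum(f & o))
    false_alarms = int(np.sum(f & ~o))
    misses = int(np.sum(~f & o))
    correct_negatives = int(np.sum(~f & ~o))
    return hits, false_alarms, misses, correct_negatives


def categorical_scores(forecast, observed, threshold):
    """Return CSI, ETS and frequency bias for one threshold."""
    h, fa, m, cn = contingency_counts(forecast, observed, threshold)
    n = h + fa + m + cn
    nan = float("nan")

    # Critical Success Index: hits over all forecast or observed events.
    csi = h / (h + fa + m) if (h + fa + m) else nan

    # Equitable Threat Score: CSI corrected for hits expected by chance.
    hits_random = (h + fa) * (h + m) / n if n else nan
    ets_denominator = h + fa + m - hits_random
    ets = (h - hits_random) / ets_denominator if ets_denominator else nan

    # Frequency bias: forecast events over observed events (>1 = over-forecast).
    bias = (h + fa) / (h + m) if (h + m) else nan

    return {"csi": csi, "ets": ets, "frequency_bias": bias}


if __name__ == "__main__":
    # Toy values in arbitrary units, not data from the study.
    print(categorical_scores([0, 1, 2, 3], [0, 2, 0, 3], threshold=1.0))
```

A frequency bias above 1 corresponds to the over-forecasting that the abstract reports for the input forecasts.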
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
CEC1: 'Comment on egusphere-2022-648', Juan Antonio Añel, 24 Aug 2022
Dear authors,
We have detected some missing material in your manuscript. Since your work is an application of deep learning techniques, we need you to include with your manuscript the input and output files used and obtained. You state that you are using output files from COSMO as input data. We do not need the complete output files from COSMO, only the fields and variables you use. This information is important to ensure minimum replicability of your results, as they depend critically on the input, much more so than in works using other models.
Therefore, please publish the data in one of the appropriate repositories (b2share is ok), and reply to this comment with the relevant information (link and DOI) as soon as possible, as it should be available for the Discussions stage.
In addition, any revised version of your manuscript must include the modified 'Code and Data Availability' section, with the DOI of the data.
Juan A. Añel
Geosci. Model Dev. Exec. Editor
Citation: https://doi.org/10.5194/egusphere-2022-648-CEC1
AC1: 'Reply on CEC1', Adrian Rojas Campos, 04 Sep 2022
Dear Editor,
We are very sorry for our late reply. We faced some difficulties moving the data from JSC HPC to B2Share, which delayed our response.
The data can be found at http://doi.org/10.23728/b2share.60c69270d36243779fd771c2fd81fc87, and this link will be included in the paper.
Unfortunately, due to B2Share limitations, we were not able to additionally share the final tfrecords. However, the tfrecords files can be produced from the provided files using the Jupyter notebook preprocessing.ipynb, which is available in the Git repository of the paper (https://github.com/DeepRainProject/models_for_radar).
Best regards,
Citation: https://doi.org/10.5194/egusphere-2022-648-AC1
CEC2: 'Reply on AC1', Juan Antonio Añel, 13 Oct 2022
Dear authors,
While checking the data availability of your manuscript, and after reviewing the terms of b2share, we found that we cannot accept it as a suitable repository. In its terms of service, b2share states that data storage is only assured for two years, and such a period is not enough. We usually request a minimum of ten years, ideally twenty.
Therefore, could you please change the repository to a suitable one? For example, we could accept Zenodo.org, PANGAEA or FigShare.
Many thanks,
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2022-648-CEC2
AC4: 'Reply on CEC2', Adrian Rojas Campos, 24 Oct 2022
Dear Editor,
We have uploaded the datasets to a new repository on Zenodo.org. You can find the files at https://doi.org/10.5281/zenodo.7244319
Best regards,
Citation: https://doi.org/10.5194/egusphere-2022-648-AC4
RC1: 'Comment on egusphere-2022-648', Anonymous Referee #1, 05 Sep 2022
Comments:
The authors trained and evaluated five different deep learning models to generate precipitation maps. The manuscript is well written overall and provides a clear description of the results. The following comments are intended to help the authors further improve the manuscript.
1. Introduction section
(1) There is some confusion about the scientific question. Does the study compare different model algorithms, generate precipitation maps, or both?
(2) What is the novelty of this study? Just the complete set of variables?
(3) Some references are missing in this section, for example at line 22, “Among the most successful methods are the Numerical Weather Predictions models (NWP), which consist of systems of equations that simulate the dynamics of the atmosphere and provide highly accurate weather forecasts over long periods.”, and at line 27, “However, NWP models still preserve some limitations, the most important being the large number of computational resources needed to generate forecasts”, and so on. Please double-check the whole text.
2. Data section
(1) I think this section is not well structured. There is no information about the study area, and some statements lack references. As a result, I cannot tell whether they are your results or those of others, for example “The south-western part is characterized by low precipitation amounts between 500 and 700mm/yr due to lee effects of the Eifel mountain range”.
(2) Please describe the COSMO-DE-EPS forecast in more detail, giving readers more information about it and its reliability.
(3) Please give the full names of “RW, YW, RQ” when they are first used in this study.
3. Methodology section
Please describe the five deep learning models in more detail, for example:
(1) Why were these five models chosen for this study?
(2) How were these models set up in this study? For example, how were the variables chosen? How were the model parameters calibrated and validated? How was the correlation between independent variables dealt with?
(3) Was the setup developed in this work or taken from another study?
4. Result section
In this section, the key information comes from Table 2 and Figure 3. However, I was confused about whether the COSMO-DE-EPS data in Table 2 are the original data or the data after correction. Maybe I missed some information. If they had been corrected using observations, it would not be surprising that they perform well.
5. Conclusions section
In this section, the authors present a summary of the results. The authors could perhaps discuss some uncertainties of the five models, such as the parameters and the influence of model uncertainty on the precipitation generation results.
Citation: https://doi.org/10.5194/egusphere-2022-648-RC1
AC2: 'Reply on RC1', Adrian Rojas Campos, 27 Sep 2022
Dear Referee 1,
Thanks for your comments and observations. These will allow us to improve the readability of our paper. Below I give a specific response to each of your questions.
Best regards,
- Introduction section
(1) There is some confusion about the scientific question. Does the study compare different model algorithms, generate precipitation maps, or both?
We developed different algorithms to generate precipitation maps and compared their performance to learn how best to solve this specific problem. Our results show that, among the different models, the CGAN generated the highest-quality precipitation maps. We will clarify this in the text.
(2) What is the novelty of this study? Just the complete set of variables?
The novelty of this study lies in using the complete set of variables from the atmospheric simulation and in combining two steps into one: downscaling the precipitation forecast and correcting its bias with a single algorithm. This will be made clear in the text.
(3) Some references are missing in this section, for example at line 22, “Among the most successful methods are the Numerical Weather Predictions models (NWP), which consist of systems of equations that simulate the dynamics of the atmosphere and provide highly accurate weather forecasts over long periods.”, and at line 27, “However, NWP models still preserve some limitations, the most important being the large number of computational resources needed to generate forecasts”, and so on. Please double-check the whole text.
References will be added to the manuscript.
- Data section
(1) I think this section is not well structured. There is no information about the study area, and some statements lack references. As a result, I cannot tell whether they are your results or those of others, for example “The south-western part is characterized by low precipitation amounts between 500 and 700mm/yr due to lee effects of the Eifel mountain range”.
We provide a basic description of the precipitation properties of the target area, and we will add further information about the area. The quoted statement is not our finding but known information about the region. We will add the respective references.
(2) Please describe the COSMO-DE-EPS forecast in more detail, giving readers more information about it and its reliability.
This information will be added.
(3) Please give the full names of “RW, YW, RQ” when they are first used in this study.
This will be corrected.
- Methodology section
Please describe the five deep learning models in more detail, for example:
(1) Why were these five models chosen for this study?
Two of the models were chosen as references: the baseline model is one of the most basic models possible, and U-Net was previously used in the literature to solve very similar problems. The rest of the models are original contributions of this work and were selected according to the literature and their proven performance on this task.
(2) How were these models set up in this study? For example, how were the variables chosen? How were the model parameters calibrated and validated? How was the correlation between independent variables dealt with?
Deep learning models perform an automatic selection of the relevant variables through the estimation of the weights. This process also handles the calibration and validation of the parameters. The correlation of independent variables is managed automatically by adjusting the weights during backpropagation. The good performance of the models was tested by evaluating the predictions on an independent test set.
(3) Was the setup developed in this work or taken from another study?
This is an original contribution from the study.
- Result section
In this section, the key information comes from Table 2 and Figure 3. However, I was confused about whether the COSMO-DE-EPS data in Table 2 are the original data or the data after correction. Maybe I missed some information. If they had been corrected using observations, it would not be surprising that they perform well.
Table 2 presents the scores of the original COSMO-DE-EPS data, which already perform well. However, the deep learning models that we trained on this information obtain a significant improvement in fidelity compared with the original COSMO-DE-EPS. This will be clarified in the manuscript to avoid confusion.
- Conclusions section
In this section, the authors present a summary of the results. The authors could perhaps discuss some uncertainties of the five models, such as the parameters and the influence of model uncertainty on the precipitation generation results.
An additional paragraph with considerations about the limitations of the models will be added to the discussion.
Citation: https://doi.org/10.5194/egusphere-2022-648-AC2
EC1: 'Comment on egusphere-2022-648', Chanh Kieu, 08 Oct 2022
In this study, the authors attempt to enhance low-resolution precipitation forecast maps from the NWP model by using deep learning models. Several deep learning models used in this study are examined, which include UNet, 2 deconvolution networks, CGANs, and a baseline. Their results demonstrate that direct mapping between physical simulations and precipitation maps can be achieved by DL models. However, the accuracy of predicting precipitation maps using their ML methods is still a challenge.
Overall, both the merit and the approach used in this study are interesting and worth consideration for publication on EGUsphere. I have only a few minor questions to help readers better follow the significance and methodology presented in this study.
1. It is mentioned that 143 forecast variables are used as input for the deep learning models in this study. However, nowhere in the text is it mentioned what these variables are and why they are relevant to precipitation augmentation. I am wondering what the possible 143 different variables that an NWP model can produce are, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed into your DL designs? It would be useful for readers to know more details about what these variables are and how they are used as input for your DL models.
2. Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
3. As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.
Citation: https://doi.org/10.5194/egusphere-2022-648-EC1
AC3: 'Reply on EC1', Adrian Rojas Campos, 12 Oct 2022
Dear Editor:
We appreciate your comments and observations about our paper. In the following, we respond to each of the points enumerated.
Best regards,
- It is mentioned that 143 forecast variables are used as input for the deep learning models in this study. However, nowhere in the text is it mentioned what these variables are and why they are relevant to precipitation augmentation. I am wondering what the possible 143 different variables that an NWP model can produce are, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed into your DL designs? It would be useful for readers to know more details about what these variables are and how they are used as input for your DL models.
The 143 variables correspond to the ensemble statistics (mean and standard deviation) of the 20 ensemble members of the COSMO-DE-EPS, which represent simulations of several atmospheric variables (wind speed, temperature, pressure, etc.) and soil and surface variables (water vapor on the surface, amount of snow, etc.). As you correctly guessed, we provide the different variables as input channels for the DL model (we tried to illustrate this in Figure 2). The DL models combine the input channels in a non-linear fashion, using the deconvolutional kernels, to generate the high-definition precipitation map. Additional information about the input variables will be added to the paper following this consideration.
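As an illustration of the idea described in this reply, the sketch below reduces an ensemble to its per-variable mean and standard deviation and stacks those statistics as input channels. It is a hypothetical example, not the authors' preprocessing: the function name, the array layout (members, variables, height, width), and the channels-last output are assumptions, and it does not attempt to reproduce the exact list of 143 channels, which is not given in this discussion.

```python
import numpy as np


def ensemble_to_channels(ensemble):
    """Turn an ensemble into a channels-last input tensor.

    ensemble: array of shape (members, variables, height, width)
              (assumed layout, for illustration only).
    returns:  array of shape (height, width, 2 * variables), holding the
              ensemble means first and the ensemble standard deviations after.
    """
    ensemble = np.asarray(ensemble, dtype=np.float32)
    mean = ensemble.mean(axis=0)  # (variables, height, width)
    std = ensemble.std(axis=0)    # (variables, height, width)
    stacked = np.concatenate([mean, std], axis=0)
    # Move the statistics axis to the end so each statistic is one channel.
    return np.moveaxis(stacked, 0, -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 20 members as in COSMO-DE-EPS; 3 variables on a toy 4x5 grid.
    toy_ensemble = rng.normal(size=(20, 3, 4, 5))
    channels = ensemble_to_channels(toy_ensemble)
    print(channels.shape)
```

A convolutional or deconvolutional layer applied to such a tensor mixes all channels at each spatial location, which is one way the non-linear combination mentioned above can arise.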
- Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
This is a very important point. Our approach was to include sufficient input information about the atmospheric state (143 channels), together with a sufficiently complex model, and let the model automatically discover the relevant patterns in the input data that improve the precipitation forecast. This means that the complex DL models learn the relevant non-linear interactions between the input information for each type of precipitation, without an explicit classification on our side.
Given the improvement in performance obtained by our models, we could assume that the DL algorithms learned to differentiate between the different types of precipitation, using the relevant information and filtering out the unimportant information in each case. We consider it essential to include enough information about the meteorological state so that the right mapping can be performed. Additional considerations about this will be included as part of the discussion.
- As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.
We will adapt our methodology section and code to meet the standard guidelines.
Citation: https://doi.org/10.5194/egusphere-2022-648-AC3
RC2: 'Comment on egusphere-2022-648', Anonymous Referee #2, 01 Dec 2022
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-648/egusphere-2022-648-RC2-supplement.pdf
Interactive discussion
Status: closed
-
CEC1: 'Comment on egusphere-2022-648', Juan Antonio Añel, 24 Aug 2022
Dear authors,
We have detected some missing material in your manuscript. Since your work is an application of deep learning techniques, we need that you include with your manuscript the input and output files used and obtained. You state that you are using as input data output files from COSMO. We do not need the complete output files from COSMO but the fields and variables you use. This is important information to assure minimum replicability of your results, as they depend critically on the input and much more than works using other models.
Therefore, please, publish the data in one of the appropriate repositories (b2share is ok), and reply to this comment with the relevant information (link and DOI) as soon as possible, as it should be available for the Discussions stage.
Also, in this way, you must include in a potential reviewed version of your manuscript the modified 'Code and Data Availability' section, with the DOI of the data.
Juan A. Añel
Geosci. Model Dev. Exec. EditorCitation: https://doi.org/10.5194/egusphere-2022-648-CEC1 -
AC1: 'Reply on CEC1', Adrian Rojas Campos, 04 Sep 2022
Dear Editor,
We are very sorry for our late reply. We faced some difficulties moving the data from JSC HPC to B2Share, which delayed our response.
The data can be found under the direction: http://doi.org/10.23728/b2share.60c69270d36243779fd771c2fd81fc87 and this link will be included in the paper.
Unfortunately due to B2Share limitations, we were not able to additionally share the final tfrecords. However, using Jupyter Notebook preprocessing.ipynb provided in the git of the paper (https://github.com/DeepRainProject/models_for_radar) the tfrecords files can be produced using the provided files.
Best regards,
Citation: https://doi.org/10.5194/egusphere-2022-648-AC1 -
CEC2: 'Reply on AC1', Juan Antonio Añel, 13 Oct 2022
Dear authors,
Checking the data availability of your manuscript, and after a check of the terms of b2hare, it has come out that we can not accept it as a suitable repository. In terms of service, b2share states that data storage is only assured for two years, and such a period is not enough. We usually request a minimum of ten years, ideally twenty.
Therefore, please, could you change the repository to a suitable one? For example, we could accept Zenodo.org, PANGAEA or FigShare.
Many thanks,
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2022-648-CEC2 -
AC4: 'Reply on CEC2', Adrian Rojas Campos, 24 Oct 2022
Dear Editor,
We have uploaded the datasets to a new repository in Zenodo.org. You can find the files under the direction: https://doi.org/10.5281/zenodo.7244319
Best regards,
Citation: https://doi.org/10.5194/egusphere-2022-648-AC4
-
AC4: 'Reply on CEC2', Adrian Rojas Campos, 24 Oct 2022
-
CEC2: 'Reply on AC1', Juan Antonio Añel, 13 Oct 2022
-
AC1: 'Reply on CEC1', Adrian Rojas Campos, 04 Sep 2022
-
RC1: 'Comment on egusphere-2022-648', Anonymous Referee #1, 05 Sep 2022
comments:
The authors trained and evaluated five different deep learning models to generate precipitation maps. The manuscript is well-written in overall and provide clear description of results. There are some comments to help the authors to further improve this manuscript.
1. Introduction section
(1) there is a confusion about scientific question. Did it compare different model algorithms or generate precipitation map or both?
(2) What is the novelty of this study? Just complete set of variables?
(3) There is a lack of some references in this section, such as line 22 “Among the most successful methods are the Numerical Weather Predictions models (NWP), which consist of systems of equations that simulate the dynamics of the atmosphere and provide highly accurate weather forecasts over long periods.” Line 27 “However, NWP models still preserve some limitations, the most important being the large number of computational resources
needed to generate forecasts” and so on. Please double check the whole text.
2. Data section
(1) I think it is not well-structured. There is no information about study area, and some information lacks of reference. Thus, I do not know is it your result or others, such as “The south-western part is characterized by low precipitation amounts between 500 and 700mm/yr due to lee effects of the Eifel mountain range”.
(2) Please give more descriptions about COSMO-DE-EPS forecast, providing more information or reliability for readers.
(3) Please give full name of “RW, YW, RQ” when they are first used in this study.
3. Methodology section
Please give more descriptions about five deep learning models, such as:
(1ï¼Why choose these five models in this study?
(2) How to setup these models in this study? For example, how to choose variables? How to calibrate and validate model parameters? How to deal with the correlation of independent variables?
(3) The setup is done by this work or referred from other study?
4. Result section
In this section, the key information is from Table 2 and Figure 3. But it confused me that COSMO-DE-EPS data in Table 2 is original data or after correction? Maybe I missed some information. If it had been corrected by observation, it is not surprised that it has good performance.
5. Conclusions section
In this section, authors presented a summary of results. Maybe authors can discuss some uncertainties about five models, such as the parameters and the influences of model uncertainty on precipitation generation results.
Citation: https://doi.org/10.5194/egusphere-2022-648-RC1 -
AC2: 'Reply on RC1', Adrian Rojas Campos, 27 Sep 2022
Dear Referee 1,
Thanks for your comments and observations. These will allow us to improve the readability of our paper. Following I attach a specific response to each of your questions.
Best regards,
- Introduction section
(1) there is a confusion about scientific question. Did it compare different model algorithms or generate precipitation map or both?
We developed different algorithms to generate precipitation maps and compare the algorithms in their performance to obtain information about how to solve best this specific problem. Our results show that the CGANs between the different models generated the highest quality precipitation maps. We will clarify this in the text.
(2) What is the novelty of this study? Just complete set of variables?
The novelty of this study is to use the complete set of variables from the atmospheric simulation and the combination of two steps into one: downscaling the precipitation forecast and correcting its bias with a single algorithm. This will be made clear in the text.
(3) There is a lack of some references in this section, such as line 22 “Among the most successful methods are the Numerical Weather Predictions models (NWP), which consist of systems of equations that simulate the dynamics of the atmosphere and provide highly accurate weather forecasts over long periods.” Line 27 “However, NWP models still preserve some limitations, the most important being the large number of computational resource needed to generate forecasts” and so on. Please double check the whole text.
References will be added to the manuscript.
- Data section
(1) I think it is not well-structured. There is no information about study area, and some information lacks of reference. Thus, I do not know is it your result or others, such as “The south-western part is characterized by low precipitation amounts between 500 and 700mm/yr due to lee effects of the Eifel mountain range”.
We provide a basic description of the precipitation properties of the target area. We will provide additional information about the area. The mentioned quote is not our finding but known information about the region. We will add the respective references.
(2) Please give more descriptions about COSMO-DE-EPS forecast, providing more information or reliability for readers.
This information will be added.
(3) Please give full name of “RW, YW, RQ” when they are first used in this study.
This will be corrected.
- Methodology section
Please give more descriptions about five deep learning models, such as:
(1) Why choose these five models in this study?
Two of the models are chosen as references: the baseline more is one of the most basic models possible, U-Net was previously used in the literature to solve very similar problems. The rest of the models are original contributions from this work and are selected according to the literature and the proven performance to solve this task.
(2) How to set up these models in this study? For example, how to choose variables? How to calibrate and validate model parameters? How to deal with the correlation of independent variables?
Deep learning models allow performing an automatic selection of the relevant variables by the estimation of the weights. It also manages the calibration and validation of the parameters. The correlation of independent variables is managed automatically by adjusting the weights during backpropagation. The good performance of the models was tested by evaluating the predictions on an independent test set.
(3) The setup is done by this work or referred from other study?
This is an original contribution from the study.
- Result section
In this section, the key information is from Table 2 and Figure 3. But it confused me that COSMO-DE-EPS data in Table 2 is original data or after correction? Maybe I missed some information. If it had been corrected by observation, it is not surprised that it has good performance.
Table 2 presents the scores of the COSMO-DE-EPs original data. It is actually a good original performance. But the deep learning models that we trained based on this information obtain a significant improvement in their fidelity, in comparison with the original COSMO-DE-EPs. This information will be clarified in the manuscript to avoid confusion.
- Conclusions section
In this section, authors presented a summary of results. Maybe authors can discuss some uncertainties about five models, such as the parameters and the influences of model uncertainty on precipitation generation results.
An additional paragraph with considerations about the limitations of the models will be added to the discussion.
Citation: https://doi.org/10.5194/egusphere-2022-648-AC2
-
AC2: 'Reply on RC1', Adrian Rojas Campos, 27 Sep 2022
-
EC1: 'Comment on egusphere-2022-648', Chanh Kieu, 08 Oct 2022
In this study, the authors attempt to enhance low-resolution precipitation forecast maps from the NWP model by using deep learning models. Several deep learning models used in this study are examined, which include UNet, 2 deconvolution networks, CGANs, and a baseline. Their results demonstrate that direct mapping between physical simulations and precipitation maps can be achieved by DL models. However, the accuracy of predicting precipitation maps using their ML methods is still a challenge.
Overall, both the merit and the approach used in this study are interesting and worth consideration for publication on EGUsphere. I have only a few minor questions to help readers better follow the significance and methodology presented in this study.
1. It is mentioned in this study that 143 forecast variables are used as input for the deep learning models in this study. However, there is nowhere in the text mentioning what are these variables and why are they relevant to precipitation augmentation. I am wondering what are the possible 143 different variables that a NWP model can produce, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed in your DL designs? It would be useful for readers to know more details about what these variables are and how are they used as input for your DL models.
2. Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
3. As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.
Citation: https://doi.org/10.5194/egusphere-2022-648-EC1
-
AC3: 'Reply on EC1', Adrian Rojas Campos, 12 Oct 2022
Dear Editor:
We appreciate your comments and observations about our paper. In the following, we respond to each of the points enumerated.
Best regards,
- It is mentioned in this study that 143 forecast variables are used as input for the deep learning models in this study. However, there is nowhere in the text mentioning what are these variables and why are they relevant to precipitation augmentation. I am wondering what are the possible 143 different variables that a NWP model can produce, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed in your DL designs? It would be useful for readers to know more details about what these variables are and how are they used as input for your DL models.
The 143 variables correspond to the ensemble statistics (mean and standard deviation) of the 20 ensemble members of the COSMO-DE-EPS, which simulate several atmospheric variables (wind speed, temperature, pressure, etc.) as well as soil and surface variables (water vapor on the surface, amount of snow, etc.). As you correctly guessed, we provide the different variables as input channels to the DL models (we tried to illustrate this in Figure 2). The DL models combine the input channels in a non-linear fashion using the deconvolutional kernels to generate the high-resolution precipitation map. Additional information about the input variables will be added to the paper in response to this comment.
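To make the channel layout concrete, the following is a minimal sketch of how ensemble members could be reduced to per-grid-cell mean and standard deviation maps and stacked as input channels. It is an illustration under stated assumptions, not the authors' preprocessing code: the variable names, the toy grid size, the random data, and the choice of the population standard deviation are all hypothetical, and the function names `ensemble_statistics` and `build_input_channels` are invented for this example.

```python
import random
from statistics import fmean, pstdev

N_MEMBERS = 20  # ensemble size stated in the reply
VARIABLES = ["wind_speed", "temperature", "pressure"]  # hypothetical subset of variables
HEIGHT, WIDTH = 4, 5  # toy grid size, not the real model grid


def ensemble_statistics(members):
    """Reduce a list of 2-D fields (one per ensemble member) to mean and std maps."""
    height, width = len(members[0]), len(members[0][0])
    mean_map = [[fmean([m[i][j] for m in members]) for j in range(width)]
                for i in range(height)]
    # Assumption: population standard deviation; the source does not say which form was used.
    std_map = [[pstdev([m[i][j] for m in members]) for j in range(width)]
               for i in range(height)]
    return mean_map, std_map


def build_input_channels(forecast):
    """Stack the mean and std map of every variable as channels: [channel][row][col]."""
    channels, names = [], []
    for var, members in forecast.items():
        mean_map, std_map = ensemble_statistics(members)
        channels += [mean_map, std_map]
        names += [f"{var}_mean", f"{var}_std"]
    return channels, names


# Synthetic ensemble forecast: one random 2-D field per member and variable.
rng = random.Random(0)
forecast = {
    var: [[[rng.gauss(0.0, 1.0) for _ in range(WIDTH)] for _ in range(HEIGHT)]
          for _ in range(N_MEMBERS)]
    for var in VARIABLES
}

channels, names = build_input_channels(forecast)
print(len(channels), len(channels[0]), len(channels[0][0]))  # channels, rows, columns
```

In this toy setup every variable contributes two channels, so three variables yield six channels on a 4 x 5 grid. A convolutional or deconvolutional layer would then mix all channels at each location, which is the non-linear combination described in the reply.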
- Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
This is a very important point. Our approach was to provide sufficient input information about the atmospheric state (143 channels) to a sufficiently complex model and let the model automatically discover the patterns in the input data that are relevant for improving the precipitation prediction. This means that the DL models learn the relevant non-linear interactions among the input variables for each type of precipitation, without an explicit classification on our side.
Given the improvement in performance obtained by our models, we could assume that the DL algorithms learned to differentiate between the different types of precipitation, using the relevant information and filtering out the unimportant information in each case. We consider it essential to include enough information about the meteorological state so that the right mapping can be learned. Additional considerations on this point will be included in the discussion.
- As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.
We will adapt our methodology section and code to meet the standard guidelines.
Citation: https://doi.org/10.5194/egusphere-2022-648-AC3
-
RC2: 'Comment on egusphere-2022-648', Anonymous Referee #2, 01 Dec 2022
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-648/egusphere-2022-648-RC2-supplement.pdf
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
723 | 249 | 26 | 998 | 5 | 7 |
Adrian Rojas-Campos
Michael Langguth
Martin Wittenbrink
Gordon Pipa