Deep learning models for generation of precipitation maps based on Numerical Weather Prediction

Rojas-Campos, Adrian; Langguth, Michael; Wittenbrink, Martin; Pipa, Gordon

doi:10.5194/egusphere-2022-648

Preprints

https://doi.org/10.5194/egusphere-2022-648

04 Aug 2022

Abstract. Numerical Weather Prediction (NWP) models are atmospheric simulations that imitate the dynamics of the atmosphere and provide high-quality forecasts. One of the most significant limitations of NWP is the large amount of computational resources it requires, which limits the spatial and temporal resolution of the outputs. Traditional meteorological techniques to increase the resolution rely solely on information from a limited group of variables of interest. In this study, we offer an alternative approach in which we use the complete set of NWP variables to generate high-resolution, short-time precipitation predictions. To achieve this, five different deep learning models were trained and evaluated: a baseline, a U-Net, two deconvolutional networks, and one conditional generative adversarial network (CGAN). A total of 20 independent random initializations were performed for each of the models. The predictions were evaluated using MAE and LEPS-based skill scores, ETS, CSI, and frequency bias after applying several thresholds. The models showed a significant improvement in predicting precipitation, showing the benefits of including the complete information from the NWP. The algorithms increased the resolution of the predictions and corrected an over-forecast bias from the input information. However, some of the new models introduced new types of bias: U-Net tended toward mid-range precipitation events, and the deconvolutional models favored low rain events and generated some spatial smoothing. The CGAN offered the highest-quality precipitation forecast, generating realistic outputs and indicating possible future research paths.
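
For readers unfamiliar with the threshold-based scores named in the abstract, the following is a minimal sketch of how CSI, ETS, and frequency bias are commonly computed from a contingency table, alongside a plain MAE. It uses the standard textbook definitions and is not taken from the authors' code; the function names, the choice of ">=" for the threshold test, and the toy values are assumptions made for illustration.

```python
import numpy as np


def categorical_scores(forecast, observed, threshold):
    """Compute CSI, ETS and frequency bias for one precipitation threshold.

    Hypothetical helper based on the standard contingency-table definitions;
    an "event" is assumed to mean a value >= threshold.
    """
    f = np.asarray(forecast, dtype=float) >= threshold
    o = np.asarray(observed, dtype=float) >= threshold

    hits = np.sum(f & o)            # event forecast and observed
    false_alarms = np.sum(f & ~o)   # event forecast but not observed
    misses = np.sum(~f & o)         # event observed but not forecast
    total = f.size

    # Hits expected from a random forecast with the same marginal totals.
    hits_random = (hits + false_alarms) * (hits + misses) / total

    csi_den = hits + misses + false_alarms
    ets_den = csi_den - hits_random
    obs_events = hits + misses

    csi = hits / csi_den if csi_den > 0 else np.nan
    ets = (hits - hits_random) / ets_den if ets_den > 0 else np.nan
    bias = (hits + false_alarms) / obs_events if obs_events > 0 else np.nan
    return {"csi": float(csi), "ets": float(ets), "frequency_bias": float(bias)}


def mean_absolute_error(forecast, observed):
    """Plain MAE over all grid points."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(f - o)))


if __name__ == "__main__":
    # Toy values (made up) in arbitrary precipitation units.
    forecast = [0.0, 1.0, 2.0, 0.0]
    observed = [0.0, 2.0, 0.0, 0.5]
    print(categorical_scores(forecast, observed, threshold=1.0))
    print(mean_absolute_error(forecast, observed))
```

A frequency bias above 1 indicates that events are forecast more often than they are observed, which is the sense in which the abstract speaks of an over-forecast bias.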

Received: 13 Jul 2022 – Discussion started: 04 Aug 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

08 Mar 2023

Deep learning models for generation of precipitation maps based on numerical weather prediction

Adrian Rojas-Campos, Michael Langguth, Martin Wittenbrink, and Gordon Pipa

Geosci. Model Dev., 16, 1467–1480, https://doi.org/10.5194/gmd-16-1467-2023, 2023

Interactive discussion

Status: closed

CEC1:
'Comment on egusphere-2022-648', Juan Antonio Añel, 24 Aug 2022

Dear authors,
We have detected some missing material in your manuscript. Since your work is an application of deep learning techniques, we need you to include with your manuscript the input and output files used and obtained. You state that you are using output files from COSMO as input data. We do not need the complete output files from COSMO, only the fields and variables you use. This information is important to ensure minimum replicability of your results, as they depend critically on the input, much more so than in works using other models.
Therefore, please publish the data in one of the appropriate repositories (b2share is ok), and reply to this comment with the relevant information (link and DOI) as soon as possible, as it should be available for the Discussions stage.
Also, you must include in any revised version of your manuscript the modified 'Code and Data Availability' section, with the DOI of the data.
Juan A. Añel

Geosci. Model Dev. Exec. Editor

Citation: https://doi.org/10.5194/egusphere-2022-648-CEC1
- AC1:
  'Reply on CEC1', Adrian Rojas Campos, 04 Sep 2022
  
  Dear Editor,
  We are very sorry for our late reply. We faced some difficulties moving the data from JSC HPC to B2Share, which delayed our response.
  The data can be found at http://doi.org/10.23728/b2share.60c69270d36243779fd771c2fd81fc87, and this link will be included in the paper.
  Unfortunately, due to B2Share limitations, we were not able to additionally share the final tfrecords. However, the tfrecords files can be produced from the provided files using the Jupyter notebook preprocessing.ipynb, which is available in the Git repository of the paper (https://github.com/DeepRainProject/models_for_radar).
  Best regards,
  
  Citation: https://doi.org/10.5194/egusphere-2022-648-AC1
  - CEC2: 'Reply on AC1', Juan Antonio Añel, 13 Oct 2022
    
    Dear authors,
    While checking the data availability of your manuscript and the terms of b2share, we found that we cannot accept it as a suitable repository. In its terms of service, b2share states that data storage is only assured for two years, and such a period is not enough. We usually request a minimum of ten years, ideally twenty.
    Therefore, please, could you change the repository to a suitable one? For example, we could accept Zenodo.org, PANGAEA or FigShare.
    Many thanks,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-648-CEC2
    
    AC4: 'Reply on CEC2', Adrian Rojas Campos, 24 Oct 2022
    
    Dear Editor,
    We have uploaded the datasets to a new repository on Zenodo.org. You can find the files at https://doi.org/10.5281/zenodo.7244319
    Best regards,
    
    Citation: https://doi.org/10.5194/egusphere-2022-648-AC4
RC1:
'Comment on egusphere-2022-648', Anonymous Referee #1, 05 Sep 2022

comments:

The authors trained and evaluated five different deep learning models to generate precipitation maps. The manuscript is well written overall and provides a clear description of the results. Below are some comments to help the authors further improve this manuscript.

1. Introduction section

(1) There is some confusion about the scientific question. Does the study compare different model algorithms, generate precipitation maps, or both?

(2) What is the novelty of this study? Just complete set of variables?

(3) There is a lack of some references in this section, such as line 22 “Among the most successful methods are the Numerical Weather Predictions models (NWP), which consist of systems of equations that simulate the dynamics of the atmosphere and provide highly accurate weather forecasts over long periods.” Line 27 “However, NWP models still preserve some limitations, the most important being the large number of computational resources needed to generate forecasts” and so on. Please double check the whole text.

2. Data section

(1) I think it is not well structured. There is no information about the study area, and some information lacks references. Thus, I do not know whether it is your result or someone else's, such as “The south-western part is characterized by low precipitation amounts between 500 and 700mm/yr due to lee effects of the Eifel mountain range”.

(2) Please give more descriptions about COSMO-DE-EPS forecast, providing more information or reliability for readers.

(3) Please give full name of “RW, YW, RQ” when they are first used in this study.

3. Methodology section

Please give more descriptions about five deep learning models, such as:

(1) Why choose these five models in this study?

(2) How to setup these models in this study? For example, how to choose variables? How to calibrate and validate model parameters? How to deal with the correlation of independent variables?

(3) The setup is done by this work or referred from other study?

4. Result section

In this section, the key information is from Table 2 and Figure 3. But it confused me whether the COSMO-DE-EPS data in Table 2 is the original data or the data after correction. Maybe I missed some information. If it had been corrected by observation, it is not surprising that it performs well.

5. Conclusions section

In this section, authors presented a summary of results. Maybe authors can discuss some uncertainties about five models, such as the parameters and the influences of model uncertainty on precipitation generation results.

Citation: https://doi.org/10.5194/egusphere-2022-648-RC1
- AC2:
  'Reply on RC1', Adrian Rojas Campos, 27 Sep 2022
  Dear Referee 1,
  Thanks for your comments and observations. These will allow us to improve the readability of our paper. Below, I attach a specific response to each of your questions.
  Best regards,
  Introduction section
  
  (1) There is some confusion about the scientific question. Does the study compare different model algorithms, generate precipitation maps, or both?
  We developed different algorithms to generate precipitation maps and compared their performance to learn how best to solve this specific problem. Our results show that, among the different models, the CGAN generated the highest-quality precipitation maps. We will clarify this in the text.
  (2) What is the novelty of this study? Just complete set of variables?
  The novelty of this study is to use the complete set of variables from the atmospheric simulation and the combination of two steps into one: downscaling the precipitation forecast and correcting its bias with a single algorithm. This will be made clear in the text.
  (3) There is a lack of some references in this section, such as line 22 “Among the most successful methods are the Numerical Weather Predictions models (NWP), which consist of systems of equations that simulate the dynamics of the atmosphere and provide highly accurate weather forecasts over long periods.” Line 27 “However, NWP models still preserve some limitations, the most important being the large number of computational resources needed to generate forecasts” and so on. Please double check the whole text.
  References will be added to the manuscript.
  Data section
  
  (1) I think it is not well structured. There is no information about the study area, and some information lacks references. Thus, I do not know whether it is your result or someone else's, such as “The south-western part is characterized by low precipitation amounts between 500 and 700mm/yr due to lee effects of the Eifel mountain range”.
  We provide a basic description of the precipitation properties of the target area. We will provide additional information about the area. The mentioned quote is not our finding but known information about the region. We will add the respective references.
  (2) Please give more descriptions about COSMO-DE-EPS forecast, providing more information or reliability for readers.
  This information will be added.
  (3) Please give full name of “RW, YW, RQ” when they are first used in this study.
  This will be corrected.
  Methodology section
  
  Please give more descriptions about five deep learning models, such as:
  (1) Why choose these five models in this study?
  Two of the models were chosen as references: the baseline model is one of the most basic models possible, and U-Net was previously used in the literature to solve very similar problems. The remaining models are original contributions of this work and were selected according to the literature and their proven performance on this task.
  (2) How to set up these models in this study? For example, how to choose variables? How to calibrate and validate model parameters? How to deal with the correlation of independent variables?
  Deep learning models allow performing an automatic selection of the relevant variables by the estimation of the weights. It also manages the calibration and validation of the parameters. The correlation of independent variables is managed automatically by adjusting the weights during backpropagation. The good performance of the models was tested by evaluating the predictions on an independent test set.
  (3) The setup is done by this work or referred from other study?
  This is an original contribution from the study.
  Result section
  
  In this section, the key information is from Table 2 and Figure 3. But it confused me whether the COSMO-DE-EPS data in Table 2 is the original data or the data after correction. Maybe I missed some information. If it had been corrected by observation, it is not surprising that it performs well.
  Table 2 presents the scores of the original COSMO-DE-EPS data, which already perform well. However, the deep learning models that we trained on this information achieve a significant improvement in fidelity compared with the original COSMO-DE-EPS. This will be clarified in the manuscript to avoid confusion.
  Conclusions section
  
  In this section, authors presented a summary of results. Maybe authors can discuss some uncertainties about five models, such as the parameters and the influences of model uncertainty on precipitation generation results.
  An additional paragraph with considerations about the limitations of the models will be added to the discussion.
  
  Citation: https://doi.org/10.5194/egusphere-2022-648-AC2
EC1:
'Comment on egusphere-2022-648', Chanh Kieu, 08 Oct 2022

In this study, the authors attempt to enhance low-resolution precipitation forecast maps from the NWP model by using deep learning models. Several deep learning models used in this study are examined, which include UNet, 2 deconvolution networks, CGANs, and a baseline. Their results demonstrate that direct mapping between physical simulations and precipitation maps can be achieved by DL models. However, the accuracy of predicting precipitation maps using their ML methods is still a challenge.
Overall, both the merit and the approach used in this study are interesting and worth consideration for publication on EGUsphere. I have only a few minor questions to help readers better follow the significance and methodology presented in this study.
1. It is mentioned in this study that 143 forecast variables are used as input for the deep learning models in this study. However, there is nowhere in the text mentioning what are these variables and why are they relevant to precipitation augmentation. I am wondering what are the possible 143 different variables that a NWP model can produce, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed in your DL designs? It would be useful for readers to know more details about what these variables are and how are they used as input for your DL models.
2. Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
3. As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.

Citation: https://doi.org/10.5194/egusphere-2022-648-EC1
- AC3:
  'Reply on EC1', Adrian Rojas Campos, 12 Oct 2022
  Dear Editor:
  We appreciate your comments and observations about our paper. In the following, we respond to each of the points enumerated.
  Best regards,
  It is mentioned in this study that 143 forecast variables are used as input for the deep learning models in this study. However, there is nowhere in the text mentioning what are these variables and why are they relevant to precipitation augmentation. I am wondering what are the possible 143 different variables that a NWP model can produce, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed in your DL designs? It would be useful for readers to know more details about what these variables are and how are they used as input for your DL models.
  
  The 143 variables correspond to the ensemble statistics (mean and standard deviation) of the 20 ensemble members of the COSMO-DE-EPS, which represent simulations of several atmospheric variables (wind speed, temperature, pressure, etc.) and soil and surface variables (water vapor on the surface, amount of snow, etc.). As you correctly guessed, we provide the different variables as input channels for the DL model (we tried to illustrate this in Figure 2). The DL models combine the input channels in a non-linear fashion using the deconvolutional kernels to generate the high-definition precipitation map. Additional information about the input variables will be added to the paper following this consideration.
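
As a rough illustration of the channel construction described in this reply, the sketch below reduces a stack of ensemble members to per-variable mean and standard deviation fields and stacks them along a channel axis. It is not the authors' preprocessing code: the function name, the array layout, and the toy sizes are assumptions, and the reply does not specify the exact combination of fields that yields 143 channels.

```python
import numpy as np


def ensemble_to_channels(ensemble):
    """Turn ensemble members into mean/std input channels.

    Assumed (hypothetical) input layout: (members, variables, height, width).
    Returns an array of shape (height, width, 2 * variables), with all mean
    fields first, followed by all standard-deviation fields.
    """
    ensemble = np.asarray(ensemble, dtype=np.float32)
    mean = ensemble.mean(axis=0)   # (variables, height, width)
    std = ensemble.std(axis=0)     # (variables, height, width)
    stats = np.concatenate([mean, std], axis=0)
    # Move the statistics axis last so each statistic becomes one channel.
    return np.moveaxis(stats, 0, -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy sizes only: 20 members, 3 variables, a 4 x 5 grid.
    toy = rng.normal(size=(20, 3, 4, 5))
    print(ensemble_to_channels(toy).shape)
```

A channels-last layout is used here only because it is a common default for convolutional layers; the layout actually used in the study may differ.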
  Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
  
  This is a very important point. Our approach was to include sufficient input information about the atmospheric state (143 channels), together with a sufficiently complex model, and let the model automatically discover the relevant patterns in the input data that improve the precipitation forecast. This means that the complex DL models learn the relevant non-linear interactions between the input information for each type of precipitation, without an explicit classification from our side.
  Given the improvement in performance obtained by our models, we could assume that the DL algorithms learned to differentiate between the different types of precipitation, using the relevant information and filtering out the unimportant information in each case. We consider it essential to include enough information about the meteorological state so that the right mapping can be performed. Additional considerations about this will be included as part of the discussion.
  As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.
  
  We will adapt our methodology section and code to meet the standard guidelines.
  
  Citation: https://doi.org/10.5194/egusphere-2022-648-AC3
RC2: 'Comment on egusphere-2022-648', Anonymous Referee #2, 01 Dec 2022

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-648/egusphere-2022-648-RC2-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2022-648-RC2


Citation: https://doi.org/10.5194/egusphere-2022-648-EC1
- AC3:
  'Reply on EC1', Adrian Rojas Campos, 12 Oct 2022
  Dear Editor:
  We appreciate your comments and observations about our paper. In the following, we respond to each of the points enumerated.
  Best regards,
  It is mentioned in this study that 143 forecast variables are used as input for the deep learning models in this study. However, there is nowhere in the text mentioning what are these variables and why are they relevant to precipitation augmentation. I am wondering what are the possible 143 different variables that a NWP model can produce, and how these variables are handled in your DL approach. Are they treated as different channels of input? If not, how are they combined and/or fed in your DL designs? It would be useful for readers to know more details about what these variables are and how are they used as input for your DL models.
  
  The 143 variables correspond to the ensemble statistics (mean and standard deviation) of the 20 ensemble members of the COSMO-DE-EPS, which simulate several atmospheric variables (wind speed, temperature, pressure, etc.) as well as soil and surface variables (water vapor on the surface, amount of snow, etc.). As you correctly guessed, we provide the different variables as input channels to the DL models (we tried to illustrate this in Figure 2). The DL models combine the input channels in a non-linear fashion using the deconvolutional kernels to generate the high-resolution precipitation map. Additional information about the input variables will be added to the paper following this consideration.
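  As a minimal illustration of the channel construction described in this reply (not the authors' actual preprocessing code), the sketch below reduces a stack of ensemble members to per-variable mean and standard deviation and concatenates them along the channel axis. The function name `ensemble_to_channels`, the array layout, the three placeholder variables, and the 8x8 grid are all assumptions made for the example; only the 20 ensemble members come from the text.

```python
import numpy as np


def ensemble_to_channels(ensemble):
    """Collapse an ensemble into mean/std input channels.

    ensemble: array of shape (members, variables, height, width).
    Returns an array of shape (2 * variables, height, width), with all
    mean channels first and all standard-deviation channels after them.
    (Hypothetical layout, chosen only for this illustration.)
    """
    mean = ensemble.mean(axis=0)  # (variables, height, width)
    std = ensemble.std(axis=0)    # (variables, height, width)
    return np.concatenate([mean, std], axis=0)


# Synthetic stand-in data: 20 members, 3 placeholder variables, 8x8 grid.
rng = np.random.default_rng(0)
ensemble = rng.normal(size=(20, 3, 8, 8))

channels = ensemble_to_channels(ensemble)
print(channels.shape)  # (6, 8, 8)
```

  The resulting array can be passed to a convolutional model as a multi-channel image, so each kernel sees every variable at a given grid point.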
  Precipitation is generally a subtle variable, which is an end product of many processes and scales in a numerical model. While the overall objective of this work is to enhance the model precipitation output by using ML, the characteristics of different types of precipitation such as stratiform or convective precipitation are very different. I am not sure if the ML models can help distinguish these different types of precipitation, which is in fact related to my comment # 1 above on the use of 143 forecast variables as input. There is no discussion of how many of these variables are essential for different types of precipitation. One could of course combine all possible input and see what potential outcome from an ML model could be. However, more input channels do not generally lead to a better outcome, since some bad channels could degrade the ML performance. Any discussion on the relative importance of different input variables for different types of precipitation would be helpful here.
  
  This is a very important point. Our approach was to include sufficient input information about the atmospheric state (143 channels), together with a sufficiently complex model, and let the model automatically discover the patterns in the input data that improve the precipitation prediction. This means that the complex DL models learn the relevant non-linear interactions between the input variables for each type of precipitation, without an explicit classification on our side.
  Given the improvement in performance obtained by our models, we could assume that the DL algorithms learned to differentiate between the different types of precipitation, using the relevant information and filtering out the unimportant information in each case. We consider it essential to include enough information about the meteorological state so that the right mapping can be performed. Additional considerations on this point will be included in the discussion.
  As a “model description paper”, the manuscript is expected to be detailed and accessible for a wide range of geophysical communities as described here https://www.geoscientific-model-development.net/about/manuscript_types.html#item1. However, the current methodology section (section 3) is too brief for readers to follow and appreciate your ML model settings and approach. Please provide additional information as instructed in the link above to meet the standard guideline of EGUsphere.
  
  We will adapt our methodology section and code to meet the standard guidelines.
  
  Citation: https://doi.org/10.5194/egusphere-2022-648-AC3
RC2: 'Comment on egusphere-2022-648', Anonymous Referee #2, 01 Dec 2022

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-648/egusphere-2022-648-RC2-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2022-648-RC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Adrian Rojas Campos on behalf of the Authors (16 Jan 2023) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (19 Jan 2023) by Chanh Kieu

RR by Anonymous Referee #1 (28 Jan 2023)

ED: Publish as is (31 Jan 2023) by Chanh Kieu

AR by Adrian Rojas Campos on behalf of the Authors (02 Feb 2023) Manuscript

Journal article(s) based on this preprint

08 Mar 2023

Deep learning models for generation of precipitation maps based on numerical weather prediction

Adrian Rojas-Campos, Michael Langguth, Martin Wittenbrink, and Gordon Pipa

Geosci. Model Dev., 16, 1467–1480, https://doi.org/10.5194/gmd-16-1467-2023, 2023

Viewed

Total article views: 998 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
723	249	26	998	5	7

Views and downloads (calculated since 04 Aug 2022)

Month	HTML	PDF	XML	Total
Aug 2022	211	61	7	279
Sep 2022	117	39	6	162
Oct 2022	128	46	9	183
Nov 2022	80	30	0	110
Dec 2022	59	18	3	80
Jan 2023	43	22	0	65
Feb 2023	72	29	0	101
Mar 2023	13	4	1	18

Viewed (geographical distribution)

Total article views: 954 (including HTML, PDF, and XML) Thereof 954 with geography defined and 0 with unknown origin.

Latest update: 11 Apr 2026

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

Our manuscript presents an alternative approach for generating high-resolution precipitation maps based on the non-linear combination of the complete set of variables of the numerical weather predictions. This process combines the super-resolution task with bias correction in a single step, generating high-resolution corrected precipitation maps with a 3-hour lead time. We used deep learning algorithms to combine the input information and increase the accuracy of the precipitation maps.

