the Creative Commons Attribution 4.0 License.
CCdownscaling: an open-source Python package for multivariable statistical climate model downscaling V1.0
Abstract. Statistical downscaling methods provide an essential bridge between low-resolution global climate models and the localized information needed by decision makers. As demand for localized climate projections grows across a wide variety of applications, so does the need for software that can produce downscaled data. The CCdownscaling package described in this article provides a number of downscaling methods, including self-organizing maps (SOMs), as well as a number of evaluation metrics for assessing downscaling model skill. In this article, we describe the features of the CCdownscaling package and show an example use case for downscaling temperature and precipitation. The package is open-source and freely available for use in generating downscaled projections.
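To make the idea of a statistical downscaling method concrete, the following is a minimal sketch of empirical quantile mapping, one of the methods the reviewers below mention as part of the package. It is an illustration only, not CCdownscaling's implementation: the function names `empirical_quantiles` and `quantile_map`, the synthetic Gaussian data, and the choice of 100 quantiles are all assumptions, and only the Python standard library is used.

```python
import bisect
import random
import statistics


def empirical_quantiles(values, n_quantiles=100):
    """Return n_quantiles + 1 evenly spaced empirical quantiles (hypothetical helper)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    quantiles = []
    for i in range(n_quantiles + 1):
        pos = last * i / n_quantiles        # fractional index into the sorted sample
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        quantiles.append(ordered[lo] * (1 - frac) + ordered[hi] * frac)
    return quantiles


def quantile_map(value, model_q, obs_q):
    """Map one model value onto the observed distribution (single variable only)."""
    # Values outside the training range are clamped to the observed extremes.
    if value <= model_q[0]:
        return obs_q[0]
    if value >= model_q[-1]:
        return obs_q[-1]
    # Find the bracketing model quantiles: model_q[j - 1] <= value < model_q[j].
    j = bisect.bisect_right(model_q, value)
    frac = (value - model_q[j - 1]) / (model_q[j] - model_q[j - 1])
    # Interpolate between the matching observed quantiles.
    return obs_q[j - 1] + frac * (obs_q[j] - obs_q[j - 1])


# Synthetic stand-ins for station observations and coarse model output.
random.seed(0)
obs = [random.gauss(10.0, 3.0) for _ in range(5000)]
model = [random.gauss(12.0, 2.0) for _ in range(5000)]

model_q = empirical_quantiles(model)
obs_q = empirical_quantiles(obs)
corrected = [quantile_map(v, model_q, obs_q) for v in model]

print(round(statistics.mean(model), 2), round(statistics.mean(corrected), 2))
```

After mapping, the mean and spread of `corrected` should sit close to those of `obs`. Because each variable is mapped through its own distribution, a sketch like this handles one variable at a time.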
This preprint has been withdrawn.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2022-282', Anonymous Referee #1, 17 Jun 2022
A potentially useful software package that enables comparison of different statistical downscaling results from multiple techniques, with a focus on self-organising maps (SOM). As the package is adaptable to the incorporation of new and different machine learning methods, it would be useful to publish even given the limitations of the research example presented. Some general and technical comments that would improve the manuscript follow.
Line 72-73: No information is given on how or why the input variables for SOM training were selected. At least a reference to the supporting work that led to these selections should be cited.
Line 83: In discussing Fig. 1, and within the figure itself, there is no information on how to use projected variables for climate change impact assessment. Ideally this should be included as part of the example case. For example, would variables from GCMs require bias-correction before use, so that their observed-period statistics match those of the reanalysis data?
Line 116: Regarding Fig. 2, none of the units for the colour gradient or numbers are explained.
Line 117: Regarding Fig. 3, what are the x and y axes of the gradient plots showing?
Line 133: Re: Table 1, where possible the units of the various metrics should be shown, e.g. what are the units of the bias?
Line 136: You say "quantile mapping is inherently single variable", but the earlier description of SOM doesn't make it clear how it's applied jointly to multiple variables. It appears it's applied independently to precipitation and temperature, is that not the case?
Line 183: Re: Fig. 4, what do the dimensions of a SOM refer to? I.e., what do the horizontal and vertical axes represent?
Line 188: So the choice of SOM size is somewhat subjective, based on expert knowledge, rather than on any objective or automated optimisation? Is that a weakness?
Line 192: Qmap appears to perform best across the full distribution, is that the case and so worth mentioning?
Line 211: You say "SOM far outperforms the random forest on KS statistic", however RF two part is much closer. That is worth noting?
Line 234: Re: Fig. 7, given SOM underestimates lag-1 autocorrelation, does that mean it's unsuitable for extreme multi-day event precipitation and therefore, for example, flood applications?
Table 2 says "Random Forest does the best job of matching the day-to-day values, with the lowest RMSE", but actually Qmap has the lowest RMSE.
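For reference, the skill metrics these comments refer to (RMSE, bias, the two-sample KS statistic, and lag-1 autocorrelation) can be illustrated with a minimal sketch. It is not taken from CCdownscaling or from the manuscript: all function names are hypothetical, and only the Python standard library is used.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error of paired day-to-day values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mean_bias(obs, pred):
    """Mean of the predictions minus mean of the observations (same units as the data)."""
    return sum(pred) / len(pred) - sum(obs) / len(obs)


def lag1_autocorrelation(series):
    """Pearson correlation between the series and itself shifted by one time step."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x so i/len(a) and j/len(b) are the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


observed = [0.0, 2.0, 5.0, 1.0, 0.0, 3.0]
downscaled = [0.5, 1.5, 4.0, 1.5, 0.0, 2.5]
print(rmse(observed, downscaled), mean_bias(observed, downscaled))
print(ks_statistic(observed, downscaled), lag1_autocorrelation(downscaled))
```

RMSE rewards matching individual days, while the KS statistic compares whole distributions and lag-1 autocorrelation reflects day-to-day persistence, so a method can rank well on one and poorly on another.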
Citation: https://doi.org/10.5194/egusphere-2022-282-RC1
RC2: 'Comment on egusphere-2022-282', Anonymous Referee #2, 26 Jun 2022
The authors propose an open-source Python package (namely CCdownscaling) for climate model downscaling. The package includes the self-organizing map (SOM), two random forest-based methods, and the quantile mapping approach, all of which have been used for downscaling for many years. All the methods, including SOM, were trained and tested at one location in Chicago, Illinois. In general, the manuscript presents a case of software package development, not geoscientific model development, and so does not seem to match any of the six aims of the GMD journal. It does not improve current downscaling methods or models, nor does it present detailed, rigorous descriptions and evaluations of any new methods. I provide specific comments as follows.
Major issues:
- The novelty of this study is not justified. This study focuses on developing a software package for downscaling by packaging a few existing empirical downscaling methods, but it does not improve any downscaling technique or present any new method. SOM is an existing method that has been used for downscaling for many years. The advantages of SOM compared to the other downscaling approaches are also not clear. Is it worthwhile to develop such a software package?
- Several existing downscaling packages are not considered in the manuscript, such as ClimDown in R: https://rdrr.io/cran/ClimDown/, and SBCK in both R and Python: https://github.com/yrobink/SBCK. These two packages include many downscaling and bias correction approaches that have been fully evaluated in the upstream research papers. What is the added value of the CCdownscaling package?
- The advantage of the SOM method compared with other approaches for climate downscaling is not clear. The authors need to justify the value of developing a software package for SOM.
- The validation does not include any extreme events or indices. The validation results can be highly uncertain since they rely on data from a single site. A sufficient number of sites covering different conditions is needed for robust evaluation.
- The downscaling method only considers four variables: relative humidity at 850 hPa, air temperature at 850 hPa, geopotential height at 850 hPa, and two wind speeds at 700 hPa, while many other physical covariates of precipitation are ignored. Note that many precipitation covariates are available in different reanalysis datasets as well as in GCM simulations.
Minor issues:
- In Section 4, the authors mention the advantage of SOM in providing insight, through pattern detection, into the weather patterns that give rise to specific downscaled outcomes. However, the explanation of pattern detection is not clear. Figures 2 and 3 lack context. The node example (0,4) is not clear. More notation and explanation are needed for pattern detection.
- The performance of SOM approach is similar to the basic quantile mapping approach (see Figure 6 and Figure 7 as well as the two tables). What could be the reason for that and the potential avenues to improve SOM performance?
- Computer code should be put in the appendix, not be part of the main text.
- Figure 6 is not a histogram.
Citation: https://doi.org/10.5194/egusphere-2022-282-RC2
EC1: 'Comment on egusphere-2022-282', Jatin Kala, 04 Jul 2022
Although it is not usual for handling editors to provide comments before the authors submit a revised manuscript, in this instance I think it will be useful for the authors to be aware of my overall assessment, based on the manuscript and the reports by the two anonymous reviewers.
I agree with the sentiment of both reviewers, especially reviewer 2, that this manuscript falls short on the criteria of Scientific Significance and Scientific Quality.
The manuscript, in its present form, does not present substantial new concepts, ideas or methods in geoscientific model development. The package developed by the authors is no doubt useful and handy, but the novelty is not justified. On scientific quality, the analysis presented is more illustrative of what the package is able to do than an in-depth analysis of different methods that would better inform geoscientific model development.
In this instance, I would not recommend re-submission unless the authors are prepared to carry out substantially more work addressing these issues. I hope the authors find this useful.
Citation: https://doi.org/10.5194/egusphere-2022-282-EC1
Data sets
CCdownscaling example use case data - O'Hare airport, Andrew Polasky, https://zenodo.org/record/6506677
Model code and software
CCdownscaling v1.0, Andrew Polasky, https://zenodo.org/record/6506660
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
879 | 764 | 25 | 1,668 | 21 | 19 |