CCdownscaling: an open-source Python package for multivariable statistical climate model downscaling V1.0

Polasky, Andrew D.; Evans, Jenni L.; Fuentes, Jose D.

doi:https://doi.org/10.5194/egusphere-2022-282

06 May 2022

Status: this preprint has been withdrawn by the authors.

Abstract. Statistical downscaling methods provide an essential bridge between low-resolution global climate models and the localized information needed by decision makers. As demand for localized climate projections grows across a wide variety of applications, so does the need for software that can provide downscaled data. The CCdownscaling package described in this article provides a number of downscaling methods, including Self Organizing Maps, as well as a number of evaluation metrics for assessing the skill of downscaling models. In this article, we describe the features of the CCdownscaling package and show an example use case for downscaling temperature and precipitation. The package is open-source and freely available for use in generating downscaled projections.

How to cite. Polasky, A. D., Evans, J. L., and Fuentes, J. D.: CCdownscaling: an open-source Python package for multivariable statistical climate model downscaling V1.0, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2022-282, 2022.

Received: 03 May 2022 – Discussion started: 06 May 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-282', Anonymous Referee #1, 17 Jun 2022

A potentially useful software package that enables comparison of different statistical downscaling results from multiple techniques, with a focus on self-organising maps (SOM). As the package is adaptable to the incorporation of new and different machine learning methods, it would be useful to publish even given the limitations of the research example presented. Some general and technical comments that would improve the manuscript follow.

Line 72-73: No information is given on how or why the input variables for SOM training were selected. At least a reference to the supporting work that led to these selections should be cited.

Line 83: In discussing Fig. 1, and within the figure itself, there is no information on how to use projected variables for climate change impact assessment. Ideally this should be included as part of the example case. For example, would variables from GCMs require bias-correction before use, so that their observed-period statistics match those of the reanalysis data?
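
The bias-correction step raised in this comment can be illustrated with empirical quantile mapping, which adjusts GCM values so that their historical-period distribution matches the reanalysis. The following is a minimal sketch in plain Python, not the CCdownscaling API; the function names (`quantile`, `interp`, `quantile_map`), the `n_quantiles` parameter, and the toy data are assumptions made for illustration.

```python
from bisect import bisect_right


def quantile(sorted_data, p):
    """Linearly interpolated quantile of already-sorted data, 0 <= p <= 1."""
    pos = p * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped at both ends.

    xs must be non-decreasing and the same length as ys.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # guarantees xs[i - 1] <= x < xs[i]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


def quantile_map(gcm_hist, reanalysis, gcm_values, n_quantiles=101):
    """Map GCM values onto the reanalysis distribution.

    gcm_hist and reanalysis cover the same historical period. Values outside
    the historical GCM range are clamped to the reanalysis extremes, which is
    a simplification of this sketch.
    """
    gcm_sorted = sorted(gcm_hist)
    ref_sorted = sorted(reanalysis)
    probs = [k / (n_quantiles - 1) for k in range(n_quantiles)]
    gcm_q = [quantile(gcm_sorted, p) for p in probs]
    ref_q = [quantile(ref_sorted, p) for p in probs]
    return [interp(v, gcm_q, ref_q) for v in gcm_values]


# Toy data (hypothetical): the GCM runs 2 units warmer than the reanalysis.
reanalysis = [float(i) for i in range(11)]
gcm_hist = [r + 2.0 for r in reanalysis]
corrected = quantile_map(gcm_hist, reanalysis, [7.0, 2.0, 12.0])
```

In this toy case the constant warm bias is removed from each value. Whether such a correction is appropriate for the predictors used in the manuscript is the question the referee leaves to the authors.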

Line 116: Regarding Fig. 2, none of the units for the colour gradient or numbers are explained.

Line 117: Regarding Fig. 3, what are the x and y axes of the gradient plots showing?

Line 133: Re: Table 1, where possible the units of the various metrics should be shown, e.g. what are the units of the bias?

Line 136: You say "quantile mapping is inherently single variable", but the earlier description of SOM doesn't make it clear how it's applied jointly to multiple variables. It appears it's applied independently to precipitation and temperature, is that not the case?

Line 183: Re: Fig. 4, what do the dimensions of a SOM refer to? i.e. what are the horizontal and vertical axis representing?

Line 188: So the choice of SOM size is somewhat subjective, based on expert knowledge, rather than on any objective or automated optimisation? Is that a weakness?

Line 192: Qmap appears to perform best across the full distribution, is that the case and so worth mentioning?

Line 211: You say "SOM far outperforms the random forest on KS statistic", however RF two part is much closer. That is worth noting?
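
For reference, the KS statistic discussed in this comment is the largest vertical distance between the empirical distribution functions of two samples (here, downscaled and observed values). The sketch below computes it in plain Python; it is an illustration under that standard definition, not the package's own evaluation code, and the function name `ks_statistic` is an assumption.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical CDFs.
    Both CDFs are step functions that change only at data points, so it is
    enough to evaluate them at every observed value.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical toy samples: identical, disjoint, and partly overlapping.
same = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
overlap = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A value near 0 means the two distributions are close; a value of 1 means they do not overlap at all.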

Line 234: Re: Fig. 7, given SOM underestimates lag-1 autocorrelation, does that mean it's unsuitable for extreme multi-day event precipitation and therefore, for example, flood applications?
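
The lag-1 autocorrelation referred to here measures how strongly each day's value resembles the previous day's. A minimal sketch of one common estimator follows, in plain Python; the function name `lag1_autocorrelation` and the toy series are assumptions, and the manuscript may use a different estimator.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: sum of products of consecutive deviations
    from the mean, divided by the total sum of squared deviations."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return num / denom


# Hypothetical toy series: one persistent (a long dry spell followed by a
# long wet spell) and one that flips sign every day.
persistent = [0.0] * 50 + [1.0] * 50
alternating = [1.0, -1.0] * 50
r_persistent = lag1_autocorrelation(persistent)
r_alternating = lag1_autocorrelation(alternating)
```

A downscaled series whose value is well below that of the observations has less day-to-day persistence than observed, which is why the referee asks about multi-day events.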

Table 2 says "Random Forest does the best job of matching the day-to-day values, with the lowest RMSE", but actually Qmap has the lowest RMSE.

Citation: https://doi.org/10.5194/egusphere-2022-282-RC1

RC2: 'Comment on egusphere-2022-282', Anonymous Referee #2, 26 Jun 2022

The authors propose an open-source Python package (namely CCdownscaling) for climate model downscaling. The package includes the self-organizing map (SOM), two random forest-based methods, and the quantile mapping approach, all of which have been used for downscaling for many years. All the methods, including SOM, were trained and tested at one location in Chicago, Illinois. In general, the manuscript presents a case of software package development, not geoscientific model development, and so does not seem to match any of the six aims of the GMD journal. It does not improve current downscaling methods or models, or present detailed, rigorous descriptions and evaluations of any new methods. I provide specific comments as follows.

Major issues:

The novelty of this study is not justified. The study focuses on developing a software package for downscaling by packaging a few existing empirical downscaling methods, but it does not improve any downscaling technique or present any new method. SOM is an existing method that has been used for downscaling for many years. The advantages of SOM compared to the other downscaling approaches are also not clear. Is it worthwhile to develop such a software package?

Several existing downscaling packages are not considered in the manuscript, such as ClimDown in R: https://rdrr.io/cran/ClimDown/, and SBCK in both R and Python: https://github.com/yrobink/SBCK. These two packages include many downscaling and bias correction approaches that have been fully evaluated in the upstream research papers. What is the added value of the CCdownscaling package?

The advantage of the SOM method compared with other approaches for climate downscaling is not clear. The authors need to justify the value of developing a software package for SOM.

The validation does not include any extreme events or indices. The validation results can be highly uncertain since they rely on data from a single site. A sufficient number of sites covering different conditions is needed for robust evaluation.

The downscaling method only considers four variables: relative humidity at 850 hPa, air temperature at 850 hPa, geopotential height at 850 hPa, and two wind speeds at 700 hPa, while many other physical covariates of precipitation are ignored. Note that many precipitation covariates are available in different reanalysis datasets as well as in GCM simulations.

Minor issues:

In Section 4, the authors mentioned the SOM advantage of providing insight into the weather patterns giving rise to specific downscaled outcomes through pattern detection. But the explanation about pattern detection is not clear. Figures 2 and 3 are lacking context. The node example (0,4) is not clear. More notation and explanation are needed for pattern detection.

The performance of the SOM approach is similar to that of the basic quantile mapping approach (see Figure 6 and Figure 7 as well as the two tables). What could be the reason for that, and what are the potential avenues to improve SOM performance?

Computer code should be put in the appendix, not in the main text.

Figure 6 is not a histogram.

Citation: https://doi.org/10.5194/egusphere-2022-282-RC2

EC1: 'Comment on egusphere-2022-282', Jatin Kala, 04 Jul 2022

Although it is not usual for handling editors to provide comments before the authors submit a revised manuscript, in this instance I think it will be useful for the authors to be aware of my overall assessment, based on the manuscript and the reports by the two anonymous reviewers.
I agree with the sentiment of both reviewers, especially reviewer 2, that this manuscript falls short on the criteria of Scientific Significance and Scientific Quality.
The manuscript in its present form does not present substantial new concepts, ideas or methods in geoscientific model development. The package developed by the authors is no doubt useful and handy, but the novelty is not justified. On scientific quality, the analysis presented is more illustrative of what the package is able to do than an in-depth analysis of different methods that would better inform geoscientific model development.
In this instance, I would not recommend re-submission unless the authors are prepared to carry out substantially more work addressing these issues. I hope the authors find this useful.

Citation: https://doi.org/10.5194/egusphere-2022-282-EC1

Data sets

CCdownscaling example use case data - O'Hare airport. Andrew Polasky. https://zenodo.org/record/6506677

Model code and software

CCdownscaling v1.0. Andrew Polasky. https://zenodo.org/record/6506660

Viewed

Total article views: 1,746 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
903	813	30	1,746	26	25

Viewed (geographical distribution)

Total article views: 1,676 (including HTML, PDF, and XML) Thereof 1,676 with geography defined and 0 with unknown origin.

Country	#	Views	%
China	1	441	26
United States of America	2	320	19
Germany	3	125	7
India	4	58	3
United Kingdom	5	42	2


Latest update: 08 Apr 2025

Andrew D. Polasky

Department of Meteorology and Atmospheric Science, The Pennsylvania State University

Jenni L. Evans

CORRESPONDING AUTHOR

jle7@psu.edu

Department of Meteorology and Atmospheric Science, The Pennsylvania State University

Institute for Computational and Data Sciences, The Pennsylvania State University, University Park

Jose D. Fuentes

Department of Meteorology and Atmospheric Science, The Pennsylvania State University

Short summary

Statistical downscaling provides methods to bridge the gap between global climate models and the scale of information needed to understand the impacts of climate change. This paper describes a new software package that provides a number of statistical downscaling approaches, as well as evaluation metrics for these methods. The goal of this work is to provide a new tool for researchers carrying out downscaling studies, and to enable the easy use and comparison of different downscaling methods.
