climQMBC: A package with multiple bias correction methods of GCM climatic variables at daily, monthly and annual scale, developed in Python, R and MATLAB
Abstract. Climate change projections are studied using General Circulation Models (GCMs). GCMs are models that simulate climate on a broad scale, hence they cannot be directly used in local impact studies, such as, for example, hydrological studies. GCMs must go through a process of downscaling, to adjust their results in terms of spatial scale and reduce their bias before being used at the local scale. Quantile Mapping is one of the most widely used approaches for bias correcting GCM climate outputs. However, in its conventional formulation QM assumes a time-invariant correction function, which potentially results in additional biases. This has motivated the development of trend-preserving variations, accounting for a non-stationary correction function and aiming to preserve the raw GCM signal. Unfortunately, choosing which variation to use is not straight-forward. We present the climQMBC package (https://github.com/saedoquililongo/climQMBC or https://doi.org/10.5281/zenodo.18392900) as an easy-to-use tool to compare quantile mapping approaches. climQMBC is available in Python, R and MATLAB, and contains the classic QM method and four trend-preserving variations: Detrended Quantile Mapping (DQM), Quantile Delta Mapping (QDM), Unbiased Quantile Mapping (UQM) and Scaled Distribution Mapping (SDM). This package has a built-in summary report that allows comparing methods in terms of their capability of preserving raw GCM trends. A synthetic exercise showed that the most reliable methods are the UQM and DQM.
This article presents a software package that consolidates multiple quantile mapping methods into one common interface, and makes it available in the three most common coding languages used by climate scientists and others who use climate projections. While limited to quantile mapping type approaches, this will be a useful software package for many people exploring climate impacts on a local or regional scale. My suggestions below are largely aimed at providing a more complete context for this package relative to available resources and methods.
1) Line 34, following the sentence on RCMs, it might be worth mentioning that there are also hybrid methods (e.g., Walton et al., 2015, https://doi.org/10.1175/JCLI-D-14-00196.1) that combine dynamical and statistical downscaling.
2) Lines 34-36, while the software being presented covers only QM variations, the sentence listing “several bias correction (or bias adjustment) methods” should be a little more explicit about what these consist of (for example, analogues). You might also mention that analogue methods would not be amenable to the approach of this software, where all methods essentially apply a point-to-point bias correction. In addition, machine-learning approaches to downscaling are emerging (e.g., Soares et al., 2024, https://doi.org/10.5194/gmd-17-229-2024) and should be noted.
3) Lines 58-61, a little more discussion would help characterize the uncertainties. The Lafferty et al. (2023) reference would provide a good framing of this (https://doi.org/10.1038/s41612-023-00486-0).
4) Line 85, add a short paragraph discussing the availability of downscaled data, which is increasingly being used by stakeholders to avoid having to do downscaling at all. A few examples of CMIP6 global downscaled data sets are those from the Climate Impacts Lab (Gergel et al., 2024, https://doi.org/10.5194/gmd-17-191-2024), School of Geography and Environmental Science (2022, https://doi.org/10.5285/c107618f1db34801bb88a1e927b82317) and the NASA-NEX archive (Thrasher et al., 2022, https://doi.org/10.1038/s41597-022-01393-4). At regional/continental scales there are many more data sets. Of course, each uses its own observational baseline data, training period, etc., so attributing differences to specific sources is not possible, which means there is still value in facilitating statistical downscaling.
5) Line 285, the example script (Figure 4) and the linked example notebook on GitHub are helpful illustrations of how to use the package at a single point. It may exist on the GitHub site, but I could not find an example of applying the package to a gridded (e.g., netCDF) data set. Since that would be a relatively common application, developing such an example would make the package more complete.
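To illustrate the kind of gridded example I have in mind, here is a minimal sketch in Python. It does not call any climQMBC function: `empirical_qm` and `correct_grid` are hypothetical names, the correction is a plain empirical quantile mapping standing in for whichever method the package would apply, and the synthetic (time, lat, lon) arrays are stand-ins for variables that would normally be read from netCDF files.

```python
import numpy as np


def empirical_qm(obs, mod_hist, mod_full, n_quantiles=100):
    """Classic empirical quantile mapping for one grid cell (1-D series).

    Hypothetical helper for illustration only; not the climQMBC API.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles + 1)
    q_obs = np.quantile(obs, probs)       # observed quantiles
    q_mod = np.quantile(mod_hist, probs)  # modeled quantiles, historical period
    # Piecewise-linear transfer function from modeled to observed quantiles.
    # Values outside the historical modeled range are clamped to the
    # observed minimum/maximum (constant extrapolation).
    return np.interp(mod_full, q_mod, q_obs)


def correct_grid(obs, mod, n_hist):
    """Apply the point-wise correction independently to every grid cell.

    obs: array of shape (n_hist, n_lat, n_lon)
    mod: array of shape (n_time, n_lat, n_lon); the first n_hist time steps
         are assumed to overlap the observed period.
    """
    out = np.empty(mod.shape, dtype=float)
    for i, j in np.ndindex(mod.shape[1:]):
        out[:, i, j] = empirical_qm(obs[:, i, j], mod[:n_hist, i, j], mod[:, i, j])
    return out


# Synthetic stand-in for gridded data: 3 x 4 cells, 360 historical time steps
# followed by 360 future time steps, with a warm and over-dispersed model.
rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, size=(360, 3, 4))
mod = rng.normal(13.0, 3.0, size=(720, 3, 4))

corrected = correct_grid(obs, mod, n_hist=360)
print(corrected.shape)
```

The explicit loop over cells mirrors the point-to-point nature of the methods and keeps the sketch free of dependencies beyond NumPy; for large grids a vectorized or parallel approach would likely be preferable.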