the Creative Commons Attribution 4.0 License.
BHRR v1.0: a two-stage Transformer framework for simultaneous spatial restoration and quantile-function bias correction of climate model temperature fields
Abstract. Bias correction of climate model temperature fields in the image domain is difficult because general circulation model (GCM) outputs and observation-based references follow different statistical distributions at each grid cell, so pixel-wise regression can recover spatial structure while leaving distributional biases intact. This study presents a two-stage Transformer framework, bias-corrected high-resolution restoration (BHRR), that addresses this problem by decoupling spatial restoration from distribution-aware bias correction. The two stages are applied in sequence and evaluated on daily near-surface temperature fields over a fixed Oceania domain of 200×280 grid points (latitude × longitude, 0.25° resolution). In the first stage, a Restormer model restores high-resolution spatial structure from linearly interpolated model fields. In the second stage, a Vision Transformer predicts a reference-based quantile map that is used as an explicit transfer function for equidistant cumulative distribution function (CDF) matching in future projections. Across daily minimum, mean, and maximum near-surface air temperature, the restoration stage improves spatial fidelity, increasing median structural similarity to 0.876–0.908 and median peak signal-to-noise ratio to 26.6–28.1 dB. The bias-correction stage further reduces systematic error, yielding near-zero median percent bias (<0.1%) and lowering median root mean square error (by approximately 0.5 K for mean temperature and from 4.4 K to 3.7 K for maximum temperature). To verify that the framework preserves climate-change signals rather than collapsing future projections toward historical climatology, future projections under SSP2-4.5 and SSP5-8.5 are examined using ETCCDI extreme indices and Sen's slope. 
The results confirm scenario-dependent differences in extreme-temperature diagnostics, and spatial-variability analysis shows patterns consistent with a standard downscaled benchmark, supporting the use of the BHRR v1.0 framework as a technical post-processing tool for distribution-aware bias correction of gridded climate fields.
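To make the equidistant CDF matching step named in the abstract more concrete, the following is a minimal sketch for a single grid cell, written with NumPy. It is not the authors' implementation: the function name `equidistant_cdf_matching`, the argument names, and the choice of 101 quantile levels are illustrative assumptions, and the reference quantiles are estimated empirically here, whereas BHRR predicts the reference-based quantile map with a Vision Transformer. The sketch uses the commonly used additive form, in which each future value is shifted by the difference between the reference and historical-model quantiles at the same non-exceedance probability.

```python
import numpy as np


def equidistant_cdf_matching(ref_hist, mod_hist, mod_fut, n_quantiles=101):
    """Additive equidistant CDF matching for one grid cell (1-D samples).

    corrected = x + Q_ref(p) - Q_modhist(p), where p = F_modfut(x).
    All names and the number of quantile levels are illustrative assumptions.
    """
    ref_hist = np.asarray(ref_hist, dtype=float)
    mod_hist = np.asarray(mod_hist, dtype=float)
    mod_fut = np.asarray(mod_fut, dtype=float)

    # Probability levels at which the three quantile functions are tabulated.
    probs = np.linspace(0.0, 1.0, n_quantiles)
    q_ref = np.quantile(ref_hist, probs)   # reference quantile function
    q_hist = np.quantile(mod_hist, probs)  # historical model quantile function
    q_fut = np.quantile(mod_fut, probs)    # future model quantile function

    # Non-exceedance probability of each future value within the future CDF.
    # np.interp expects increasing x-coordinates; heavy ties in the data
    # would need extra handling that this sketch omits.
    p = np.interp(mod_fut, q_fut, probs)

    # Quantile-dependent shift: reference minus historical model at level p.
    shift = np.interp(p, probs, q_ref) - np.interp(p, probs, q_hist)
    return mod_fut + shift


# Usage with synthetic data (values in kelvin are made up for illustration).
rng = np.random.default_rng(0)
model_hist = rng.normal(290.0, 5.0, size=2000)
reference = model_hist - 2.0   # model assumed 2 K too warm
model_fut = model_hist + 3.0   # assumed 3 K warming signal
corrected = equidistant_cdf_matching(reference, model_hist, model_fut)
```

In this synthetic case the shift is the same at every quantile, so the correction removes the assumed 2 K bias while the assumed 3 K change between periods is retained. If the tabulated reference quantiles were stacked for every cell of the domain, they would form an array of shape (n_quantiles, latitude, longitude); whether that matches the layout of the quantile map used in BHRR is an assumption of this sketch.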
Status: open (until 15 Jul 2026)
- RC1: 'Comment on egusphere-2026-1958', Anonymous Referee #1, 08 Jun 2026
- AC2: 'Reply on RC1', Eun-Sung Chung, 22 Jun 2026
Dear Editor and Reviewer,
We sincerely thank the reviewer for the careful reading of our manuscript and for the constructive comments, which have helped us substantially improve the clarity and rigor of the paper.
Our detailed, point-by-point responses to all eleven comments are provided in the attached PDF. For each comment, we reproduce the reviewer's comment, give our response, and quote the corresponding revised manuscript text. Where relevant, we also list the references that have been added to strengthen the manuscript.
We hope that our responses adequately address all the concerns raised. We would be happy to provide any further clarification.
On behalf of all co-authors,
Young Hoon Song
- CEC1: 'Comment on egusphere-2026-1958 - No compliance with the policy of the journal', Juan Antonio Añel, 21 Jun 2026
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
First, to provide access to the Restormer code, you link to a GitHub site. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. In addition, to access part of the data used in your work, you cite a NASA site (NASA Earth Exchange Global Daily Downscaled Projections) and a paper that contains a link to hydrology.princeton.edu; however, neither fulfils GMD’s requirements for a persistent data archive (in fact, the hydrology.princeton.edu site no longer resolves) because:
- They do not appear to have a published policy for data preservation over many years or decades (some flexibility exists over the precise length of preservation, but the policy must exist).
- They do not appear to have a published mechanism for preventing authors from unilaterally removing material. Archives must have a policy which makes removal of materials only possible in exceptional circumstances and subject to an independent curatorial decision.
- They do not appear to issue a persistent identifier such as a DOI or Handle for each precise dataset.
If we have missed a published policy which does in fact address this matter satisfactorily, please post a response linking to it. If you have any questions about this issue, please post them in a reply.
The GMD review and publication process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends, and on ensuring the provenance and replicability of the published papers for years after their publication. Please, therefore, publish your code and data in one of the appropriate repositories and reply to this comment with the relevant information (a link and a permanent identifier for it, e.g. a DOI) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
Later, if the Topical Editor decides to continue with the review or publication process of your manuscript and you are requested to upload a new version of it, then the 'Code and Data Availability' section of your manuscript must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2026-1958-CEC1
- AC1: 'Reply on CEC1', Eun-Sung Chung, 22 Jun 2026
Dear Dr. Añel,
Thank you for checking our manuscript against the GMD Code and Data Policy and for the opportunity to clarify. We fully share the goal of long-term, citable access to the code and data underlying the paper, and we believe the required materials are in fact already deposited in policy-compliant archives. We suspect the issue arose because our "Code and Data Availability" section cited several datasets only by their original upstream sources, without making clear that the exact subsets we used are also redeposited in our own Zenodo archives. We have now revised the text to remove this ambiguity, and we summarise the situation below.
Two permanent Zenodo archives (DOIs) already exist and predate the discussion phase.
Data archive: https://doi.org/10.5281/zenodo.20152297 (published 13 May 2026; CC BY 4.0, with CMIP6-derived subfolders under CC BY-SA 4.0).
Code archive: https://doi.org/10.5281/zenodo.19441661 (published 6 April 2026; CC BY 4.0).
Both are openly available, version-tagged (v1.0.0), and carry persistent DOIs issued by Zenodo, which provides a published long-term preservation policy and a curated-removal policy under the CERN Data Centre / InvenioRDM infrastructure.
PGFv3 reference data. The precise PGFv3 daily near-surface temperature subset (1980–2014) used for training and evaluation is redeposited inside the data archive (CC BY 4.0). Access therefore does not depend on the original hydrology.princeton.edu distribution; the Sheffield et al. (2006) citation is retained only as upstream provenance/attribution.
NEX-GDDP-CMIP6 benchmark. The exact ACCESS-CM2 NEX-GDDP-CMIP6 subset we compared against (historical, SSP2-4.5, and SSP5-8.5) is redeposited in the data archive (NASA Open Data; redistribution with attribution permitted). The NASA DOI (https://doi.org/10.7917/OFSG3345) is retained as provenance.
CMIP6 ACCESS-CM2 inputs. The regridded historical and raw future inputs used to drive BHRR are redeposited in the data archive, with the source simulations cited by their ESGF DOIs (10.22033/ESGF/CMIP6.4271, .4321, .4332) for provenance.
Restormer code. Our complete model implementation, including both the Restormer-based restoration stage and the ViT-based bias-correction stage, is contained in the archived source files within the code archive (DOI above), and the framework runs entirely from this archive without cloning any external repository. The Restormer architecture is credited to Zamir et al. (2022) through the normal in-text citation. To avoid any ambiguity with respect to the GMD policy on GitHub, we have removed all GitHub URLs from the revised Code Availability section, so that the Zenodo archive is the sole citable version of record.
Accordingly, we have revised the "Data Availability" and "Code Availability" sections so that the two Zenodo DOIs are stated first; so that PGFv3, NEX-GDDP-CMIP6, and the CMIP6 inputs are explicitly identified as redeposited within the data archive (with upstream DOIs kept for attribution); and so that no GitHub link remains in these sections. The revised text is appended below for your convenience, and we will incorporate it, with the corresponding bibliography entries, in the next manuscript version.
If any element above does not fully satisfy the policy, we would be glad to take further steps (for example, depositing a tagged snapshot of the upstream Restormer backbone), and we welcome your guidance.
Thank you again for your careful review.
Sincerely,
Young Hoon Song, Hyung Ju Kim, and Eun-Sung Chung (on behalf of the authors)
Data availability
The precise input datasets, trained model weights, and representative example outputs needed to reproduce the results of this study are permanently archived on Zenodo (Song et al., 2026b; https://doi.org/10.5281/zenodo.20152297) under CC BY 4.0, with the CMIP6-derived subfolders under CC BY-SA 4.0. To ensure access independent of the original upstream distributors, the exact subsets used here are redeposited within this archive, namely the Princeton Global Forcing v3 daily near-surface temperature reference (1980–2014; CC BY 4.0), the ACCESS-CM2 NEX-GDDP-CMIP6 v1.0 benchmark for the historical, SSP2-4.5, and SSP5-8.5 experiments (NASA Open Data, redistributed with attribution), and the ACCESS-CM2 historical and future temperature inputs used to drive BHRR.
For provenance and attribution, the upstream sources are the CMIP6 ACCESS-CM2 r1i1p1f1 v20191108 historical (Dix et al., 2019a; https://doi.org/10.22033/ESGF/CMIP6.4271), SSP245 (Dix et al., 2019b; https://doi.org/10.22033/ESGF/CMIP6.4321), and SSP585 (Dix et al., 2019c; https://doi.org/10.22033/ESGF/CMIP6.4332) simulations from the Earth System Grid Federation; the NEX-GDDP-CMIP6 v1.0 benchmark (Thrasher et al., 2022; https://doi.org/10.7917/OFSG3345); and the Princeton Global Forcing v3 dataset (Sheffield et al., 2006; https://doi.org/10.1175/JCLI3790.1). All resources were last accessed on 13 May 2026.
Code availability
The BHRR v1.0 framework is implemented in Python using PyTorch. The exact version used to produce all results in this paper is permanently archived on Zenodo under the CC BY 4.0 license (Song et al., 2026a; https://doi.org/10.5281/zenodo.19441661). The archive contains the complete training and inference scripts, the model definitions for both the Restormer-based spatial-restoration stage (following the Restormer architecture of Zamir et al., 2022) and the ViT-based quantile bias-correction stage, utility routines, and an example notebook for reproducing the main experiments. The framework can be run entirely from this archive without retrieving any external repository.
Citation: https://doi.org/10.5194/egusphere-2026-1958-AC1
- CEC2: 'Reply on AC1', Juan Antonio Añel, 22 Jun 2026
Dear authors,
Many thanks for the clarifications. We can consider the current version of your manuscript in compliance with the policy of the journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2026-1958-CEC2
Data sets
CMIP6 ACCESS-CM2 historical (CSIRO-ARCCSS) Martin Dix et al. https://doi.org/10.22033/ESGF/CMIP6.4271
NASA NEX-GDDP-CMIP6: NASA Earth Exchange Global Daily Downscaled Projections for CMIP6 Bridget Thrasher et al. https://doi.org/10.7917/OFSG3345
Princeton Global Forcing version 3 (PGFv3) 0.25° daily data Justin Sheffield et al. https://doi.org/10.1175/JCLI3790.1
BHRR v1.0 – Data Archive (CMIP6 ACCESS-CM2 inputs, trained weights, and example outputs over Oceania) Young Hoon Song et al. https://doi.org/10.5281/zenodo.20152297
CMIP6 ACCESS-CM2 ssp245 (CSIRO-ARCCSS, ScenarioMIP) Martin Dix et al. https://doi.org/10.22033/ESGF/CMIP6.4321
CMIP6 ACCESS-CM2 ssp585 (CSIRO-ARCCSS, ScenarioMIP) Martin Dix et al. https://doi.org/10.22033/ESGF/CMIP6.4332
Model code and software
BHRR (v1.0): A two-stage Transformer framework for image-based bias correction and high-resolution restoration of climate model temperature fields Young Hoon Song et al. https://doi.org/10.5281/zenodo.19441661
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 157 | 54 | 15 | 226 | 17 | 12 | 16 |
This manuscript presents a Transformer-based post-processing tool (BHRR) for the downscaling and bias correction of global climate models. The technique is applied to three temperature variables (daily mean, maximum, and minimum near-surface air temperature) to show that BHRR can restore their detailed spatial patterns. In addition, when driven by future climate predictions, BHRR captures the signal of increasing temperature. Overall, this reviewer sees several outstanding issues and would withhold a recommendation for publication unless the following issues are adequately addressed.
1. This is what confuses the reviewer the most: BHRR is trained with ACCESS-CM2 data as the input and PGFv3 data as the target, both at the same resolution of 0.25 degrees. It therefore seems to the reviewer that BHRR has no downscaling capability, i.e. it does not improve the resolution of coarse climate model data. Is that correct?
2. Following the previous question: since current global climate models can already perform simulations at 0.25 degree resolution, why would we need BHRR which produces post-processed outputs still in 0.25 degree resolution?
3. To improve the structural clarity of the manuscript, please consider reorganizing the Introduction section. The text after line 122 can be moved to later sections: specifically, lines 122-152 to the Methods section and lines 153-169 to the Conclusions section. If the novelty of the work needs to be stressed in the Introduction section, make it concise.
4. In line 138: It is unclear what “observation-based reference dataset” means. Does it mean a high-resolution regional assimilation dataset? If so, please state this explicitly in the description.
5. In line 231: What is meant by “bias correction in quantile-function space”? This description seems opaque to anyone not very familiar with the specific topic and practices. It would be helpful to elaborate on this for the broader audience.
6. In Figure 1: The squares before and after the “linear interpolation” stage should have the same overall size (with different grid resolution), right? For now, the way the figure is drawn obscures the idea being conveyed.
7. In Line 243: It is stated that “The restoration training workflow is summarized in Fig. 2”, but Figure 2 shows something different: Overview of ViT network architecture for bias correction.
8. In Line 302: For a neural network inference task on a 200x280 grid, 15 seconds for a single inference seems extremely long. What might be the speed bottleneck here?
9. In Section 2.3.2: The term “quantile map” is used extensively in the text, but its definition is never explained. Does it mean the statistical distribution of the temperature values at each grid cell of the domain? Is a “quantile map” two-dimensional, three-dimensional, or higher-dimensional? Since this definition is central to the bias-correction section, it deserves a separate figure.
10. In Figure 3: Are the “Restorer” results predicted using input data not from the training dataset? This should be explicitly mentioned in the text.
11. Why do Figure 3 and Figure 4 employ different metrics for comparison (SSIM and PSNR in Figure 3; RMSE and PBIAS in Figure 4)? Since the ViT Bias-Corrector is applied on top of the Restorer, shouldn’t the same metrics be employed to assess how much more improvement the ViT Bias-Corrector adds on top of the Restorer?
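As an illustration of the point raised in comment 11, the sketch below computes one shared set of metrics (RMSE, PBIAS, and PSNR) that could be applied to the output of each stage against the same reference field. It is a hedged example, not code from the BHRR archive: the function names, the sign convention for PBIAS (prediction minus reference), and the default PSNR data range (the range of the reference field) are all assumptions, and the manuscript may define these differently. SSIM is omitted because it needs a windowed implementation from an image-processing library.

```python
import numpy as np


def rmse(pred, ref):
    """Root mean square error between two fields of equal shape."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


def pbias(pred, ref):
    """Percent bias, assuming the convention 100 * sum(pred - ref) / sum(ref)."""
    return float(100.0 * np.sum(pred - ref) / np.sum(ref))


def psnr(pred, ref, data_range=None):
    """Peak signal-to-noise ratio in dB.

    data_range defaults to the range of the reference field (an assumption).
    """
    if data_range is None:
        data_range = float(ref.max() - ref.min())
    mse = np.mean((pred - ref) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))


def evaluate(pred, ref):
    """Return the same metric set for any stage output."""
    return {"RMSE": rmse(pred, ref), "PBIAS": pbias(pred, ref), "PSNR": psnr(pred, ref)}


# Usage with synthetic 200x280 fields (values are made up for illustration).
rng = np.random.default_rng(0)
reference = 290.0 + 5.0 * rng.standard_normal((200, 280))
stage1_out = reference + 1.0 + 0.5 * rng.standard_normal((200, 280))  # hypothetical restored field
stage2_out = reference + 0.5 * rng.standard_normal((200, 280))        # hypothetical bias-corrected field
for name, field in [("stage 1", stage1_out), ("stage 2", stage2_out)]:
    print(name, evaluate(field, reference))
```

Reporting the same dictionary for both stages would show directly how much each metric changes when the bias-correction stage is added on top of the restoration stage.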