BHRR v1.0: a two-stage Transformer framework for simultaneous spatial restoration and quantile-function bias correction of climate model temperature fields

Song, Young Hoon; Kim, Hyung Ju; Chung, Eun-Sung

doi:10.5194/egusphere-2026-1958

Preprints

https://doi.org/10.5194/egusphere-2026-1958

20 May 2026

Abstract. Bias correction of climate model temperature fields in the image domain is difficult because general circulation model (GCM) outputs and observation-based references follow different statistical distributions at each grid cell, so pixel-wise regression can recover spatial structure while leaving distributional biases intact. This study presents a two-stage Transformer framework, bias-corrected high-resolution restoration (BHRR), that addresses this problem by decoupling spatial restoration from distribution-aware bias correction and applying the two stages in sequence. The framework is evaluated on daily near-surface temperature fields over a fixed Oceania domain of 200×280 grid points (latitude × longitude) at 0.25° resolution. In the first stage, a Restormer model restores high-resolution spatial structure from linearly interpolated model fields. In the second stage, a Vision Transformer predicts a reference-based quantile map that is used as an explicit transfer function for equidistant cumulative distribution function (CDF) matching in future projections. Across daily minimum, mean, and maximum near-surface air temperature, the restoration stage improves spatial fidelity, increasing median structural similarity to 0.876–0.908 and median peak signal-to-noise ratio to 26.6–28.1 dB. The bias-correction stage further reduces systematic error, yielding near-zero median percent bias (<0.1%) and lowering median root mean square error (by approximately 0.5 K for mean temperature and from 4.4 K to 3.7 K for maximum temperature). To verify that the framework preserves climate-change signals rather than collapsing future projections toward historical climatology, future projections under SSP2-4.5 and SSP5-8.5 are examined using ETCCDI extreme indices and Sen's slope. The results confirm scenario-dependent differences in extreme-temperature diagnostics, and spatial-variability analysis shows patterns consistent with a standard downscaled benchmark, supporting the use of the BHRR v1.0 framework as a technical post-processing tool for distribution-aware bias correction of gridded climate fields.
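
The equidistant CDF matching step named in the abstract can be illustrated for a single grid cell. The sketch below is a minimal NumPy version under stated assumptions: the reference quantiles are computed empirically from an observed series rather than predicted by the Vision Transformer as in BHRR, and the function name `edcdf_correct`, the 101-point quantile grid, and the synthetic series are hypothetical choices made only for illustration. It is not the authors' implementation.

```python
import numpy as np


def edcdf_correct(model_hist, obs_ref, model_fut, n_q=101):
    """Equidistant CDF matching for one grid cell (1-D daily series).

    Assumption: all three quantile functions are empirical. In BHRR the
    reference quantile map is predicted by a ViT; that part is not shown.
    """
    probs = np.linspace(0.0, 1.0, n_q)
    q_hist = np.quantile(model_hist, probs)  # historical model quantiles
    q_obs = np.quantile(obs_ref, probs)      # reference quantiles
    q_fut = np.quantile(model_fut, probs)    # future model quantiles

    # Non-exceedance probability of each future value in the future model CDF.
    p = np.interp(model_fut, q_fut, probs)

    # Historical bias evaluated at that same probability.
    delta = np.interp(p, probs, q_obs) - np.interp(p, probs, q_hist)

    # Shift each future value by the bias at its own quantile, which keeps
    # the model-projected change between the historical and future periods.
    return model_fut + delta


# Synthetic illustration (hypothetical numbers, in kelvin).
rng = np.random.default_rng(0)
obs = rng.normal(290.0, 5.0, 5000)   # reference series
hist = obs + 2.0                     # model with a uniform +2 K warm bias
fut = hist + 3.0                     # future model series, +3 K warming
corrected = edcdf_correct(hist, obs, fut)
print(float(np.mean(corrected) - np.mean(obs)))  # projected warming after correction
```

In this synthetic case the bias is the same at every quantile, so the correction reduces to a constant shift. With a quantile-dependent bias, the correction differs between the tails and the centre of the distribution.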

Received: 08 Apr 2026 – Discussion started: 20 May 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 5543 KB)

Supplement (806 KB)


Status: final response (author comments only)

RC1:
'Comment on egusphere-2026-1958', Anonymous Referee #1, 08 Jun 2026

This manuscript presents a Transformer-based post-processing tool (BHRR) for the downscaling and bias correction of global climate models. The technique is applied to three temperature variables (daily mean, maximum, and minimum near-surface air temperature) to show that BHRR can restore their detailed spatial patterns. In addition, when driven by future climate predictions, BHRR captures the signal of increasing temperature. Overall, this reviewer sees several outstanding issues and withholds a recommendation for publication until the following are adequately addressed.
1. This is what confuses the reviewer the most: BHRR is trained with ACCESS-CM2 data as the input and PGFv3 data as the target, both at the same resolution of 0.25 degrees. It therefore seems to the reviewer that BHRR has no downscaling capability, i.e. it does not improve the resolution of coarse climate-model data. Is that correct?
2. Following the previous question: since current global climate models can already run at 0.25 degree resolution, why is BHRR needed if its post-processed outputs are still at 0.25 degree resolution?
3. To improve the structural clarity of the manuscript, please consider reorganizing the Introduction. The material after line 122 can be moved to later sections: specifically, lines 122-152 to the Methods section and lines 153-169 to the Conclusions section. If the novelty of the work needs to be stressed in the Introduction, please do so concisely.
4. In line 138: It is unclear what “observation-based reference dataset” means. Does it mean a high-resolution regional assimilation dataset? If so, please state this explicitly.
5. In line 231: What is meant by “bias correction in quantile-function space”? This description is opaque to anyone not already familiar with the topic and its practices. Please elaborate for the broader audience.
6. In Figure 1: The squares before and after the “linear interpolation” stage should have the same overall size (with different grid resolution), right? For now, the way the figure is drawn obscures the idea being conveyed.
7. In Line 243: It is stated that “The restoration training workflow is summarized in Fig. 2”, but Figure 2 shows something different: Overview of ViT network architecture for bias correction.
8. In Line 302: For a neural network inference task on a 200x280 grid, 15 seconds per inference seems extremely long. What might be the speed bottleneck here?
9. In Section 2.3.2: The term “quantile map” is used extensively in the text, but its definition is never explained. Does it mean the statistical distribution of the temperature values at each grid cell of the domain? Is a “quantile map” two-dimensional, three-dimensional, or higher-dimensional? Since this definition is central to the bias-correction section, it deserves a separate figure.
10. In Figure 3: Are the “Restorer” results predicted using input data not from the training dataset? This should be explicitly mentioned in the text.
11. Why do Figure 3 and Figure 4 employ different metrics for comparison: in Figure 3, SSIM and PSNR; in Figure 4, RMSE and PBIAS. Since the ViT Bias-Corrector is applied on top of the Restorer, shouldn’t the same metrics be employed to assess how much more improvement the ViT Bias-Corrector adds on top of the Restorer?
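
As background to comment 11, the metrics measure different properties: RMSE and PBIAS summarize error magnitude and systematic offset, while PSNR and SSIM target image fidelity. The sketch below gives common definitions of RMSE, PBIAS, and PSNR in NumPy so that they can be applied to the output of either stage. The function names, the PBIAS sign convention (positive means overestimation), and the PSNR data range (taken as the range of the reference field) are assumptions and may differ from the manuscript's definitions. SSIM is omitted because it needs windowed local statistics, for which a library routine such as scikit-image's `structural_similarity` is normally used.

```python
import numpy as np


def rmse(sim, ref):
    """Root mean square error, in the units of the fields (e.g. K)."""
    return float(np.sqrt(np.mean((sim - ref) ** 2)))


def pbias(sim, ref):
    """Percent bias. Assumed convention: positive means sim overestimates ref."""
    return float(100.0 * np.sum(sim - ref) / np.sum(ref))


def psnr(sim, ref, data_range=None):
    """Peak signal-to-noise ratio in dB.

    Assumption: data_range defaults to the range of the reference field.
    """
    if data_range is None:
        data_range = float(ref.max() - ref.min())
    mse = np.mean((sim - ref) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Hypothetical 200 x 280 field in kelvin with a uniform +1 K error.
ref = np.linspace(280.0, 300.0, 200 * 280).reshape(200, 280)
sim = ref + 1.0
print(rmse(sim, ref), pbias(sim, ref), psnr(sim, ref))
```

Because all three functions take the same pair of arrays, the same set of metrics can be reported for the raw, restored, and bias-corrected fields.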

Citation: https://doi.org/10.5194/egusphere-2026-1958-RC1
- AC2: 'Reply on RC1', Eun-Sung Chung, 22 Jun 2026
  
  Dear Editor and Reviewer,
  We sincerely thank the reviewer for the careful reading of our manuscript and for the constructive comments, which have helped us substantially improve the clarity and rigor of the paper.
  Our detailed, point-by-point responses to all eleven comments are provided in the attached PDF. For each comment, we reproduce the reviewer's comment, give our response, and quote the corresponding revised manuscript text. Where relevant, we also list the references that have been added to strengthen the manuscript.
  We hope that our responses adequately address all the concerns raised. We would be happy to provide any further clarification.
  On behalf of all co-authors,
  Young Hoon Song
  
  Citation: https://doi.org/10.5194/egusphere-2026-1958-AC2
CEC1:
'Comment on egusphere-2026-1958 - No compliance with the policy of the journal', Juan Antonio Añel, 21 Jun 2026

Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
First, for access to the Restormer code you link to a GitHub site. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. In addition, for part of the data used in your work you cite a NASA site (NASA Earth Exchange Global Daily Downscaled Projections) and a paper that contains a link to hydrology.princeton.edu; however, neither of them fulfils GMD’s requirements for a persistent data archive (in fact, the hydrology.princeton.edu site no longer resolves) because:
- They do not appear to have a published policy for data preservation over many years or decades (some flexibility exists over the precise length of preservation, but the policy must exist).
- They do not appear to have a published mechanism for preventing authors from unilaterally removing material. Archives must have a policy which makes removal of materials only possible in exceptional circumstances and subject to an independent curatorial decision.
- They do not appear to issue a persistent identifier such as a DOI or Handle for each precise dataset.
If we have missed a published policy which does in fact address this matter satisfactorily, please post a response linking to it. If you have any questions about this issue, please post them in a reply.
The GMD review and publication process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends, and on ensuring the provenance and replicability of the published papers for years after their publication. Please, therefore, publish your code and data in one of the appropriate repositories and reply to this comment with the relevant information (a link and a permanent identifier for it, e.g. a DOI) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
Later, if the Topical Editor decides to continue with the review or publication process of your manuscript and you are requested to upload a new version of it, then the "Code and Data Availability" section of your manuscript must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel

Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2026-1958-CEC1
- AC1:
  'Reply on CEC1', Eun-Sung Chung, 22 Jun 2026
  
  Dear Dr. Añel,
  
  Thank you for checking our manuscript against the GMD Code and Data Policy and for the opportunity to clarify. We fully share the goal of long-term, citable access to the code and data underlying the paper, and we believe the required materials are in fact already deposited in policy-compliant archives. We suspect the issue arose because our "Code and Data Availability" section cited several datasets only by their original upstream sources, without making clear that the exact subsets we used are also redeposited in our own Zenodo archives. We have now revised the text to remove this ambiguity, and we summarise the situation below.
  Two permanent Zenodo archives (DOIs) already exist and predate the discussion phase.
  
  Data archive: https://doi.org/10.5281/zenodo.20152297 (published 13 May 2026; CC BY 4.0, with CMIP6-derived subfolders under CC BY-SA 4.0).
  
  Code archive: https://doi.org/10.5281/zenodo.19441661 (published 6 April 2026; CC BY 4.0).
  Both are openly available, version-tagged (v1.0.0), and carry persistent DOIs issued by Zenodo, which provides a published long-term preservation policy and a curated-removal policy under the CERN Data Centre / InvenioRDM infrastructure.
  PGFv3 reference data. The precise PGFv3 daily near-surface temperature subset (1980–2014) used for training and evaluation is redeposited inside the data archive (CC BY 4.0). Access therefore does not depend on the original hydrology.princeton.edu distribution; the Sheffield et al. (2006) citation is retained only as upstream provenance/attribution.
  
  NEX-GDDP-CMIP6 benchmark. The exact ACCESS-CM2 NEX-GDDP-CMIP6 subset we compared against (historical, SSP2-4.5, and SSP5-8.5) is redeposited in the data archive (NASA Open Data; redistribution with attribution permitted). The NASA DOI (https://doi.org/10.7917/OFSG3345) is retained as provenance.
  
  CMIP6 ACCESS-CM2 inputs. The regridded historical and raw future inputs used to drive BHRR are redeposited in the data archive, with the source simulations cited by their ESGF DOIs (10.22033/ESGF/CMIP6.4271, .4321, .4332) for provenance.
  
  Restormer code. Our complete model implementation — including both the Restormer-based restoration stage and the ViT-based bias-correction stage — is contained in the archived source files within the code archive (DOI above), and the framework runs entirely from this archive without cloning any external repository. The Restormer architecture is credited to Zamir et al. (2022) through the normal in-text citation. To avoid any ambiguity with respect to the GMD policy on GitHub, we have removed all GitHub URLs from the revised Code Availability section, so that the Zenodo archive is the sole citable version of record.
  Accordingly, we have revised the "Data Availability" and "Code Availability" sections so that the two Zenodo DOIs are stated first; so that PGFv3, NEX-GDDP-CMIP6, and the CMIP6 inputs are explicitly identified as redeposited within the data archive (with upstream DOIs kept for attribution); and so that no GitHub link remains in these sections. The revised text is appended below for your convenience, and we will incorporate it, with the corresponding bibliography entries, in the next manuscript version.
  
  If any element above does not fully satisfy the policy, we would be glad to take further steps (for example, depositing a tagged snapshot of the upstream Restormer backbone), and we welcome your guidance.
  
  Thank you again for your careful review.
  
  Sincerely,
  Young Hoon Song, Hyung Ju Kim, and Eun-Sung Chung (on behalf of the authors)
  Data availability
  The precise input datasets, trained model weights, and representative example outputs needed to reproduce the results of this study are permanently archived on Zenodo (Song et al., 2026b; https://doi.org/10.5281/zenodo.20152297) under CC BY 4.0, with the CMIP6-derived subfolders under CC BY-SA 4.0. To ensure access independent of the original upstream distributors, the exact subsets used here are redeposited within this archive, namely the Princeton Global Forcing v3 daily near-surface temperature reference (1980–2014; CC BY 4.0), the ACCESS-CM2 NEX-GDDP-CMIP6 v1.0 benchmark for the historical, SSP2-4.5, and SSP5-8.5 experiments (NASA Open Data, redistributed with attribution), and the ACCESS-CM2 historical and future temperature inputs used to drive BHRR.
  For provenance and attribution, the upstream sources are the CMIP6 ACCESS-CM2 r1i1p1f1 v20191108 historical (Dix et al., 2019a; https://doi.org/10.22033/ESGF/CMIP6.4271), SSP245 (Dix et al., 2019b; https://doi.org/10.22033/ESGF/CMIP6.4321), and SSP585 (Dix et al., 2019c; https://doi.org/10.22033/ESGF/CMIP6.4332) simulations from the Earth System Grid Federation; the NEX-GDDP-CMIP6 v1.0 benchmark (Thrasher et al., 2022; https://doi.org/10.7917/OFSG3345); and the Princeton Global Forcing v3 dataset (Sheffield et al., 2006; https://doi.org/10.1175/JCLI3790.1). All resources were last accessed on 13 May 2026.
  Code availability
  The BHRR v1.0 framework is implemented in Python using PyTorch. The exact version used to produce all results in this paper is permanently archived on Zenodo under the CC BY 4.0 license (Song et al., 2026a; https://doi.org/10.5281/zenodo.19441661). The archive contains the complete training and inference scripts, the model definitions for both the Restormer-based spatial-restoration stage (following the Restormer architecture of Zamir et al., 2022) and the ViT-based quantile bias-correction stage, utility routines, and an example notebook for reproducing the main experiments. The framework can be run entirely from this archive without retrieving any external repository.
  
  Citation: https://doi.org/10.5194/egusphere-2026-1958-AC1
  - CEC2: 'Reply on AC1', Juan Antonio Añel, 22 Jun 2026
    
    Dear authors,
    Many thanks for the clarifications. We can consider the current version of your manuscript in compliance with the policy of the journal.
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2026-1958-CEC2
RC2:
'Comment on egusphere-2026-1958', Anonymous Referee #2, 06 Jul 2026

Please see the attachment for detailed comments.

Citation: https://doi.org/10.5194/egusphere-2026-1958-RC2
- AC3: 'Reply on RC2', Eun-Sung Chung, 24 Jul 2026
  
  Dear Referee #2,
  We sincerely thank you for the thorough and constructive review of our manuscript (egusphere-2026-1958). Your comments have substantially improved the methodological clarity, the evaluation of climate-change signal preservation, and the discussion of generalizability, and we are grateful for the time and care you devoted to the assessment.
  We have carefully addressed all of your major and minor comments. A detailed point-by-point response, together with the corresponding revised manuscript text and the added references, is provided in the attached PDF. Your comments are reproduced in full and are followed by our replies and the exact revisions made.
  In brief, this revision clarifies the methodology of both stages, including the bilinear interpolation choice and its limited effect on the restored field, and the quantile-function translation performed by the Vision Transformer. We add a direct comparison against a non-learned empirical quantile-mapping baseline, which shows comparable distributional accuracy while the learned translation additionally regularizes the spatially noisy per-cell quantiles and improves spatial coherence in the tails. We quantify the preservation of the future climate-change signal against the raw ACCESS-CM2 driving model, showing that BHRR reproduces the driving-model warming trend to within roughly 5 to 10 percent across variables and scenarios and preserves the scenario ordering and the spatial pattern of change. We also expand the interpretation of the results and the discussion of transferability and the out-of-distribution assumption, and we correct the table numbering, define all metrics at first use, and add training diagnostics for both stages.
  The changes described above have been incorporated into the revised manuscript, which will be uploaded separately as instructed. We believe these revisions have substantially strengthened the manuscript, and we hope that it is now suitable for publication in Geoscientific Model Development. We remain happy to provide any further clarification.
  Sincerely,
  Eun-Sung Chung
  
  Citation: https://doi.org/10.5194/egusphere-2026-1958-AC3
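
The abstract names Sen's slope as the trend diagnostic, and the AC3 reply compares the BHRR warming trend with that of the raw driving model. A minimal sketch of that kind of comparison follows, assuming annual-mean series and the standard definition of Sen's slope as the median of all pairwise slopes. The function name `sens_slope` and the synthetic series are hypothetical and are not taken from the BHRR archive.

```python
import numpy as np


def sens_slope(y, t=None):
    """Sen's slope: median of the slopes between all pairs of time points."""
    y = np.asarray(y, dtype=float)
    if t is None:
        t = np.arange(y.size, dtype=float)
    else:
        t = np.asarray(t, dtype=float)
    i, j = np.triu_indices(y.size, k=1)  # all index pairs with i < j
    return float(np.median((y[j] - y[i]) / (t[j] - t[i])))


# Hypothetical annual-mean series (K) for a raw and a corrected projection.
years = np.arange(2015, 2101)
raw = 288.0 + 0.040 * (years - 2015)
corrected = 287.0 + 0.038 * (years - 2015)

# Ratio of corrected to raw trend: values near 1 indicate that the
# driving-model warming signal is retained after correction.
ratio = sens_slope(corrected, years) / sens_slope(raw, years)
print(ratio)
```

The pairwise computation is quadratic in the series length, which is acceptable for annual series of this size but would need a more economical approach for long daily series.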

Supplement

https://doi.org/10.5194/egusphere-2026-1958-supplement

Data sets

CMIP6 ACCESS-CM2 historical (CSIRO-ARCCSS) Martin Dix et al. https://doi.org/10.22033/ESGF/CMIP6.4271

NASA NEX-GDDP-CMIP6: NASA Earth Exchange Global Daily Downscaled Projections for CMIP6 Bridget Thrasher et al. https://doi.org/10.7917/OFSG3345

Princeton Global Forcing version 3 (PGFv3) 0.25° daily data Justin Sheffield et al. https://doi.org/10.1175/JCLI3790.1

BHRR v1.0 – Data Archive (CMIP6 ACCESS-CM2 inputs, trained weights, and example outputs over Oceania) Young Hoon Song et al. https://doi.org/10.5281/zenodo.20152297

CMIP6 ACCESS-CM2 ssp245 (CSIRO-ARCCSS, ScenarioMIP) Martin Dix et al. https://doi.org/10.22033/ESGF/CMIP6.4321

CMIP6 ACCESS-CM2 ssp585 (CSIRO-ARCCSS, ScenarioMIP) Martin Dix et al. https://doi.org/10.22033/ESGF/CMIP6.4332

Model code and software

BHRR (v1.0): A two-stage Transformer framework for image-based bias correction and high-resolution restoration of climate model temperature fields Young Hoon Song et al. https://doi.org/10.5281/zenodo.19441661

Viewed

Total article views: 277 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
186	73	18	277	20	14	17


Views and downloads (calculated since 20 May 2026)

Month	HTML	PDF	XML	Total
May 2026	132	45	13	190
Jun 2026	25	9	2	36
Jul 2026	29	19	3	51


Viewed (geographical distribution)

Total article views: 256 (including HTML, PDF, and XML), of which 256 have geography defined and 0 have unknown origin.


Latest update: 25 Jul 2026

Short summary

Climate models project future temperature but contain biases and lack spatial detail. This study develops Bias-corrected High-Resolution Restoration (BHRR v1.0), a deep-learning framework that restores high-resolution patterns from raw climate model temperature fields and corrects distributions via quantile-function mapping. Over Oceania, BHRR improves spatial accuracy, reduces errors, and preserves climate-change signals. This open-source tool enables the production of reliable high-resolution temperature data.

