BHRR v1.0: a two-stage Transformer framework for simultaneous spatial restoration and quantile-function bias correction of climate model temperature fields
Abstract. Bias correction of climate model temperature fields in the image domain is difficult because general circulation model (GCM) outputs and observation-based references occupy different statistical distributions at each grid cell, so pixel-wise regression can recover spatial structure while leaving distributional biases intact. This study presents a two-stage Transformer framework, bias-corrected high-resolution restoration (BHRR), that addresses this problem by decoupling spatial restoration from distribution-aware bias correction. The framework is evaluated on daily near-surface temperature fields over a fixed 200×280 grid-point (latitude × longitude at 0.25° resolution) Oceania domain by sequentially coupling spatial restoration and distribution-aware bias correction. In the first stage, a Restormer model restores high-resolution spatial structure from linearly interpolated model fields. In the second stage, a Vision Transformer predicts a reference-based quantile map that is used as an explicit transfer function for equidistant cumulative distribution function (CDF) matching in future projections. Across daily minimum, mean, and maximum near-surface air temperature, the restoration stage improves spatial fidelity, increasing median structural similarity to 0.876–0.908 and median peak signal-to-noise ratio to 26.6–28.1 dB. The bias-correction stage further reduces systematic error, yielding near-zero median percent bias (<0.1%) and lowering median root mean square error (mean temperature by approximately 0.5 K and maximum temperature from 4.4 K to 3.7 K). To verify that the framework preserves climate-change signals rather than collapsing future projections toward historical climatology, future projections under SSP2-4.5 and SSP5-8.5 are examined using ETCCDI extreme indices and Sen's slope. The results confirm scenario-dependent differences in extreme-temperature diagnostics, and spatial-variability analysis shows patterns consistent with a standard downscaled benchmark, supporting the use of the BHRR v1.0 framework as a technical post-processing tool for distribution-aware bias correction of gridded climate fields.
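As a reading aid for the second stage described in the abstract, the following is a minimal sketch of equidistant CDF matching at a single grid cell. It assumes the common formulation in which the quantile-dependent difference between the reference and the historical model is added to each future value. The function names, the synthetic data, and the choice of 99 empirical quantiles are illustrative assumptions and are not taken from the manuscript. In particular, BHRR is described as predicting the reference quantile map with a Vision Transformer, whereas this sketch simply computes it empirically.

```python
import numpy as np

def quantile_map(fields, probs):
    """Per-grid-cell empirical quantiles (hypothetical helper).

    fields: array of shape (time, lat, lon)
    returns: array of shape (len(probs), lat, lon)
    """
    return np.quantile(fields, probs, axis=0)

def edcdf_cell(x_fut, q_obs, q_hist, probs):
    """Equidistant CDF matching for one grid cell (illustrative only).

    x_fut:  1-D future model series at the cell
    q_obs:  reference quantiles at `probs`
    q_hist: historical model quantiles at `probs`
    """
    # Empirical non-exceedance probability of each value within the future series
    ranks = np.argsort(np.argsort(x_fut))
    p = (ranks + 0.5) / x_fut.size
    # Quantile-dependent bias: reference minus historical model at the same probability
    delta = np.interp(p, probs, q_obs) - np.interp(p, probs, q_hist)
    return x_fut + delta

# Synthetic example (assumed values, in K)
rng = np.random.default_rng(0)
probs = np.linspace(0.01, 0.99, 99)
obs = rng.normal(288.0, 5.0, size=(3650, 4, 5))  # reference
hist = obs + 2.0                                 # historical model with a +2 K bias
fut = hist + 3.0                                 # future model with +3 K warming

q_obs = quantile_map(obs, probs)                 # shape (99, 4, 5)
q_hist = quantile_map(hist, probs)
corrected = edcdf_cell(fut[:, 0, 0], q_obs[:, 0, 0], q_hist[:, 0, 0], probs)
```

In this toy setup the correction removes the 2 K bias while leaving the 3 K warming in place, which is the signal-preserving behaviour the abstract sets out to verify.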
This manuscript presents a Transformer-based post-processing tool (BHRR) for the downscaling and bias correction of global climate model output. The technique is applied to three temperature variables (daily mean, maximum, and minimum near-surface air temperature) to show that BHRR can restore their detailed spatial patterns. In addition, when driven by future climate projections, BHRR captures the warming signal. Overall, this reviewer sees several outstanding issues and would withhold a recommendation for publication unless the following points are adequately addressed.
1. This is what confuses the reviewer the most: BHRR is trained with ACCESS-CM2 data as the input and PGFv3 data as the target, both at the same resolution of 0.25 degrees. Does this mean that BHRR has no downscaling capability, that is, it cannot increase the resolution of coarse climate model output?
2. Following on from the previous question: since current global climate models can already run at 0.25-degree resolution, why is BHRR needed if its post-processed output is still at 0.25-degree resolution?
3. To improve the structural clarity of the manuscript, please consider reorganizing the Introduction section. The material from line 122 onward can be moved to later sections: specifically, lines 122-152 to the Methods section and lines 153-169 to the Conclusions section. If the novelty of the work needs to be stressed in the Introduction section, please keep it concise.
4. In line 138: It is unclear what “observation-based reference dataset” means. Does it mean a high-resolution regional assimilation dataset? If so, please state this explicitly.
5. In line 231: What is meant by “bias correction in quantile-function space”? This description is opaque to anyone not closely familiar with the topic and its practices. It would be helpful to elaborate on it for a broader audience.
6. In Figure 1: The squares before and after the “linear interpolation” stage should have the same overall size (with different grid resolution), right? For now, the way the figure is drawn obscures the idea being conveyed.
7. In Line 243: It is stated that “The restoration training workflow is summarized in Fig. 2”, but Figure 2 shows something different: Overview of ViT network architecture for bias correction.
8. In Line 302: For a neural network inference task on a 200×280 grid, 15 seconds for a single inference seems extremely long. What might be the speed bottleneck here?
9. In Section 2.3.2: The term “quantile map” is used extensively in the text, but its definition is never explained. Does it mean the statistical distribution of the temperature variable at each grid cell of the domain? Is a “quantile map” two-dimensional, three-dimensional, or higher-dimensional? Since this definition is central to the bias-correction section, it deserves a separate figure.
10. In Figure 3: Are the “Restorer” results predicted from input data that were held out of the training dataset? This should be stated explicitly in the text.
11. Why do Figure 3 and Figure 4 employ different metrics for comparison (SSIM and PSNR in Figure 3; RMSE and PBIAS in Figure 4)? Since the ViT Bias-Corrector is applied on top of the Restorer, shouldn’t the same metrics be used in both figures to assess how much improvement the ViT Bias-Corrector adds on top of the Restorer?
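To illustrate comment 11, a minimal sketch of what a common metric table for both stages could look like is given below. The function name `evaluate`, the synthetic fields, the PBIAS sign convention (positive when the prediction is too warm), and the use of the reference field's value range as the PSNR peak are all assumptions made for illustration. They are not taken from the manuscript. SSIM is left out only to keep the sketch free of extra dependencies.

```python
import numpy as np

def evaluate(pred, ref):
    """RMSE, percent bias, and PSNR of `pred` against `ref` (hypothetical helper)."""
    err = pred - ref
    mse = np.mean(err ** 2)
    data_range = ref.max() - ref.min()  # assumed PSNR peak: range of the reference
    return {
        "rmse": float(np.sqrt(mse)),
        "pbias": float(100.0 * err.sum() / ref.sum()),
        "psnr": float(10.0 * np.log10(data_range ** 2 / mse)),
    }

# Synthetic 200x280 fields in K (assumed values, not manuscript data)
ref = np.linspace(270.0, 290.0, 200 * 280).reshape(200, 280)
restored = ref + 1.0   # stand-in for stage-1 output with a 1 K warm bias
corrected = ref + 0.1  # stand-in for stage-2 output with a residual 0.1 K bias

# Same metrics for both stages, so the added value of stage 2 is directly comparable
table = {
    name: evaluate(field, ref)
    for name, field in [("restored", restored), ("corrected", corrected)]
}
```

Reporting one such table per variable would let readers see directly how each metric changes between the Restorer output and the bias-corrected output.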