This work is distributed under the Creative Commons Attribution 4.0 License.
Conditional diffusion models for downscaling & bias correction of Earth system model precipitation
Abstract. Climate change exacerbates extreme weather events like heavy rainfall and flooding. As these events cause severe socioeconomic damage, accurate high-resolution simulation of precipitation is imperative. However, existing Earth System Models (ESMs) struggle to resolve small-scale dynamics and suffer from biases. Traditional statistical bias correction and downscaling methods fall short in improving spatial structure, while recent deep learning methods lack controllability and suffer from unstable training. Here, we propose a machine learning framework for simultaneous bias correction and downscaling. We train a generative diffusion model purely on observational data. We map observational and ESM data to a shared embedding space, where both are unbiased towards each other and train a conditional diffusion model to reverse the mapping. Our method can correct any ESM field, as the training is independent of the ESM. Our approach ensures statistical fidelity, preserves large-scale spatial patterns and outperforms existing methods, especially regarding extreme events.
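To make the pipeline described in the abstract concrete, the following is a minimal sketch of the two preprocessing ideas it mentions: empirical quantile mapping of the ESM towards the observations, and noise injection so that both datasets land in a shared embedding from which a conditional diffusion model can regenerate small scales. This is an illustration only, not the authors' code: the array shapes, the gamma placeholders, the log1p transform and the use of plain white noise (the paper selects a scale-selective noise level from power spectra) are all assumptions.

```python
import numpy as np

def quantile_map(esm, esm_ref, obs_ref, n_quantiles=1000):
    """Empirical quantile mapping: re-map ESM values so that their
    distribution over a reference period matches the observations."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    esm_q = np.quantile(esm_ref, q)          # ESM quantiles (reference period)
    obs_q = np.quantile(obs_ref, q)          # observed quantiles (reference period)
    return np.interp(esm, esm_q, obs_q)      # value with the same rank in the obs distribution

def embed(field, noise_std, rng):
    """Toy shared embedding: add noise so that small-scale detail is destroyed
    and ESM- and observation-derived inputs become statistically alike."""
    return field + rng.normal(0.0, noise_std, size=field.shape)

rng = np.random.default_rng(0)
obs = rng.gamma(0.5, 2.0, size=(365, 64, 64))   # placeholder daily precipitation (mm/day)
esm = rng.gamma(0.4, 2.5, size=(365, 64, 64))   # placeholder, deliberately biased

esm_qm = quantile_map(esm, esm_ref=esm, obs_ref=obs)
z_obs = embed(np.log1p(obs), noise_std=0.3, rng=rng)     # training input (paired with obs target)
z_esm = embed(np.log1p(esm_qm), noise_std=0.3, rng=rng)  # inference input for the trained model
```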
Status: final response (author comments only)
CEC1: 'Comment on egusphere-2025-2646 - No compliance with the policy of the journal', Juan Antonio Añel, 24 Jul 2025
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html. In your manuscript you do not provide repositories for the code and data used in your study. You simply provide a number of links; in the case of the Zenodo repository for the code, the link is empty, and for the data the links only point to main webpages that do not contain the specific data/variables used in your work. Also, you do not provide a repository with the output data. Therefore, the current situation with your manuscript is irregular, and it should not have been accepted in Discussions or for peer review because of the above-mentioned issues. Please publish your code and data in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it, e.g. a DOI) as soon as possible, as we cannot accept manuscripts in Discussions that do not comply with our policy.
Also, you must include a modified 'Code and Data Availability' section in any potentially revised manuscript, containing the information on the new repositories.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in our journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-2646-CEC1
AC1: 'Reply on CEC1', Michael Aich, 30 Jul 2025
Dear Editor,
We apologize for the missing code and data uploads. We have now made both public:
Code: https://github.com/aim56009/ESM_cdifffusion_downscaling/
Input data: https://doi.org/10.5281/zenodo.16610901
Model data: https://doi.org/10.5281/zenodo.16610050
Output data: https://doi.org/10.5281/zenodo.14849653
We will also adjust our 'Code and Data Availability' section accordingly.
Best regards,
Michael Aich
Citation: https://doi.org/10.5194/egusphere-2025-2646-AC1
CEC2: 'Reply on AC1', Juan Antonio Añel, 30 Jul 2025
Dear authors,
Unfortunately, again, your reply fails to comply with our policy. I would ask you to read it carefully before replying again with something that does not comply with it.
You have provided the code hosted on a git site. Git sites are not acceptable for scientific publication. Please store your code in one of the repositories that are acceptable according to our policy.
Additionally, your implementation relies on external software, namely a number of libraries. In this case, to ensure the replicability of your work, please clarify the version numbers of the libraries you have used and the version of the Python interpreter.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-2646-CEC2
AC2: 'Reply on CEC2', Michael Aich, 31 Jul 2025
Dear Editor,
We apologize for providing the wrong code link; the code is available on Zenodo at https://doi.org/10.5281/zenodo.16629039. We also added the version numbers of the libraries required to run the code (requirements.txt).
Thank you for the correction, and best regards,
Michael Aich
Citation: https://doi.org/10.5194/egusphere-2025-2646-AC2
RC1: 'Comment on egusphere-2025-2646', Anonymous Referee #1, 02 Sep 2025
This manuscript presents a conditional diffusion model framework for simultaneous bias correction and downscaling of Earth System Model (ESM) precipitation fields. The novelty lies in training the model exclusively on observational data by mapping both ESM and observations into a shared embedding space, where quantile mapping and noise injection help align distributions. The conditional diffusion model then reconstructs small-scale precipitation structures while preserving large-scale ESM patterns. The authors evaluate the method using ERA5 as the observational reference and GFDL-ESM4 as the test ESM, showing improvements over bilinear interpolation with quantile mapping and comparisons with other diffusion approaches. They also highlight strengths in representing extremes, ensemble spread, and future climate scenario preservation.
Major Issues
1. The experimental setup is confined to one ESM (GFDL-ESM4) and one reanalysis dataset (ERA5) over a single continental region (South America). While the framework claims generality to “any ESM,” the evidence is narrow. Without testing multiple models or regions, it is unclear whether the embedding and conditional framework is robust to diverse ESM biases and precipitation regimes. Furthermore, the noise-scale hyperparameter, chosen at the spectral intersection, is dataset-specific and may require fine-tuning across contexts, raising concerns about general applicability.
2. The choice of benchmark (bilinear upsampling followed by quantile mapping, QM) is somewhat too weak given the recent literature. QM is indeed the statistical baseline, but the field has seen GAN-based approaches (cycleGANs, conditional GANs), CNN-based super-resolution, and unconditional consistency models that have been applied to similar downscaling tasks. Although the authors briefly compare with Hess et al. (2025) and an EDM model, the evaluation remains limited and not systematic. A stronger study would include comparisons against multiple state-of-the-art baselines (GAN, VAE, transformer- or CNN-based super-resolution methods) under consistent experimental conditions.
3. The manuscript does not clearly articulate how the proposed model differs from existing diffusion-based downscaling and bias correction efforts. For example, Wan et al. (2024) combined diffusion with optimal transport, while EDM (Karras et al., 2022) provides another diffusion benchmark. The authors claim advantages in efficiency and data efficiency, but the conceptual distinction between their conditional embedding approach and these prior diffusion frameworks is not fully elaborated. Is the main novelty the embedding trick with QM + noise to align distributions? Or is it the conditional supervision on observational embeddings? This needs further discussion.
4. While the results on extremes (R95p, Rx1Day) and SSP5-8.5 trends are promising, the metrics are limited. Extreme event validation could be broadened with tail-focused skill scores, quantile-specific errors, or return-level analyses (a computation sketch for two standard indices follows this list). For future scenarios, the manuscript shows preservation of mean and trend, but it remains unclear whether the method could distort physical consistency (e.g., covariance with other variables, conservation constraints). Since diffusion models are inherently stochastic, an evaluation of physical realism constraints would be useful.
5. A central claim is that the proposed method is independent of the chosen ESM because the diffusion model is trained only on observations. However, in practice, the embedding transformation g requires quantile mapping of ESMs, which is itself model-dependent. Thus, some degree of ESM-specific adjustment is unavoidable. The manuscript should acknowledge this limitation and discuss how sensitive results are to the chosen reference period, quantile mapping scheme, and observational dataset.
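For reference, the two indices named in point 4, R95p and Rx1Day, follow the standard ETCCDI definitions: the annual precipitation total on days exceeding the wet-day (>= 1 mm/day) 95th percentile of a base period, and the annual maximum one-day precipitation. The sketch below shows how such tail-focused diagnostics can be computed from daily data; the random values and array shapes are placeholders, not results from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
daily_pr = rng.gamma(0.5, 4.0, size=(30, 365))   # placeholder: 30 years x 365 days of precip (mm/day)

wet_days = daily_pr[daily_pr >= 1.0]             # wet-day threshold of 1 mm/day
p95 = np.percentile(wet_days, 95)                # wet-day 95th percentile over the base period

r95p = np.array([year[year > p95].sum() for year in daily_pr])  # very-wet-day total per year (R95p)
rx1day = daily_pr.max(axis=1)                                   # annual maximum 1-day precipitation (Rx1Day)
```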
Recommendation
The manuscript introduces a promising and technically creative approach that leverages conditional diffusion for a challenging problem in climate modeling. However, the current version has limitations in experimental breadth, benchmark rigor, and clarity of novelty relative to existing diffusion approaches. I recommend major revision before publication. The authors should expand the benchmark comparison, better articulate how their method diverges from and improves upon existing diffusion-based methods, and provide more robust multi-model/multi-region evaluations to strengthen the claim of general applicability.
Citation: https://doi.org/10.5194/egusphere-2025-2646-RC1
RC2: 'Comment on egusphere-2025-2646', Anonymous Referee #2, 11 Sep 2025
This manuscript proposes a framework to downscale GFDL-ESM4 using a conditional diffusion model trained only on ERA5. The authors align the train/test distributions by (i) applying quantile mapping (QM) to the ESM data to remove large-scale biases and (ii) adding carefully chosen noise so that both ERA5 and GFDL are projected into a shared embedding on which the conditional diffusion model is trained and applied.
What I like
- Focusing on precipitation is well motivated; it remains one of the hardest fields for ML downscaling and bias correction.
- Within the scope of their data, the authors conduct a relatively deep analysis and explore key hyperparameters. The SI is useful for additional insights.
- The idea to select a noise (cutoff) scale via the PSD relationship between ERA5 and GFDL seems new; the model then matches small-scale (high-wavenumber) PSD to ERA5 while preserving large-scale ESM information. This design appears effective based on PSDs, trend preservation, and overall fidelity.
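As a minimal sketch of that crossover idea: compute radially averaged power spectra of an observed and a quantile-mapped ESM field, and take the first wavenumber at which the two spectra intersect. The random stand-in fields, the integer radial binning and the crossing criterion below are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def radial_psd(field):
    """Radially averaged power spectral density of a 2-D field."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    ny, nx = field.shape
    ky, kx = np.indices((ny, nx))
    k = np.hypot(kx - nx // 2, ky - ny // 2).astype(int)        # integer radial wavenumber
    counts = np.bincount(k.ravel())
    psd = np.bincount(k.ravel(), weights=power.ravel()) / np.maximum(counts, 1)
    return psd[: min(ny, nx) // 2]

rng = np.random.default_rng(2)
obs_field = rng.gamma(0.5, 2.0, size=(64, 64))   # stand-in for an ERA5 snapshot
esm_field = rng.gamma(0.5, 3.0, size=(64, 64))   # stand-in for a QM-corrected GFDL snapshot

psd_obs, psd_esm = radial_psd(obs_field), radial_psd(esm_field)
sign = np.sign(np.log(psd_obs[1:]) - np.log(psd_esm[1:]))        # skip the k=0 (mean) component
crossings = np.where(np.diff(sign) != 0)[0]
k_cut = crossings[0] + 1 if crossings.size else None             # first crossover wavenumber, if any
```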
Clarify the embedding vs. preprocessing story
- Early in the paper I read the approach as latent-space manipulation after mapping both datasets into a shared space. Later it became clear that the shared embedding is achieved via preprocessing (QM + controlled noising), and the diffusion model learns to reverse the noising conditioned on the preserved large scales.
- This sequencing is a bit confusing and contributes to statements about “no dependency on the test dataset” and “no ESM - OBS pairing” being misread.
- Concretely: this is supervised training on ERA5 and inference on preprocessed GFDL that has been mapped into the same embedding. I recommend making that pipeline explicit with a schematic and a sentence like "we preprocess ERA5 and GFDL to a shared embedding (via QM + noising up to scale s); we train on ERA5 in this embedding and apply the learned conditional reverse process to embedded GFDL at inference."
Reference line in the paper: "We map observational and ESM data to a shared embedding space, where both are unbiased towards each other and train a conditional diffusion model to reverse the mapping."
Quantile mapping and potential leakage
- Please specify exactly how and when QM is fit and applied. If QM parameters are estimated using years that later appear in validation, or worse, from the future period, there is a risk of data leakage and trend distortion.
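To illustrate the leakage concern, a leakage-safe setup fits the quantile mapping on a reference period only and applies it unchanged to later (validation or future) years. The year split, the placeholder arrays and the 500-quantile resolution below are assumptions, not details from the manuscript.

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1980, 2015)
obs = {y: rng.gamma(0.5, 2.0, size=365) for y in years}   # placeholder daily observations (mm/day)
esm = {y: rng.gamma(0.4, 2.5, size=365) for y in years}   # placeholder daily ESM output

ref_years = [y for y in years if y <= 2000]                # QM is fitted on these years only
q = np.linspace(0.0, 1.0, 500)
esm_q = np.quantile(np.concatenate([esm[y] for y in ref_years]), q)
obs_q = np.quantile(np.concatenate([obs[y] for y in ref_years]), q)

def qm(x):
    """Apply the mapping fitted on the reference period; never refit on later years."""
    return np.interp(x, esm_q, obs_q)

esm_corrected_validation = {y: qm(esm[y]) for y in years if y > 2000}
```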
Scope: broaden temporal and regional tests
- The analysis is relatively deep but narrow in scope.
- Extend to at least one additional region with different regimes.
- Use a longer temporal validation, including seasonality coupled with temporal behavior (autocorrelations, wet/dry spell durations, event persistence), and spatial/temporal scatter comparisons between train (ERA5) and test (GFDL-embedded) across the full time span.
Choice of the cutoff scale s
- You present s as the PSD-based crossover that yields strong performance. That is reasonable.
- Compare to alternative mechanisms (e.g., providing a noise channel explicitly to a strong CNN baseline, or conditioning variants in diffusion that target high-frequency losses).
Baselines and diversity
- The test set lacks model diversity (single ESM), and the baselines are limited.
- Add at least one more ESM with different small-scale biases.
- Add strong ML baselines (e.g., diffusion/SR variants trained on down/upsampled ERA5 pairs, competitive CNN/Transformer SR models) alongside standard statistical methods.
Transformations and ablations
- Data undergo heavy transformations (log, scaling, etc.). Please include ablations on these choices and demonstrate their effects.
Figures
- Use a uniform, well-chosen color map and a clearer legend and contrast for Figure 3.
- In sample 4 (upper-left corner), precipitation patterns appear to change; please comment on whether this is intended regeneration of small-scale structure consistent with large scales, or an artifact.
Inputs, reproducibility, and generalizability
- What are the input channels to the model (precip only, or multivariate conditioning such as humidity, winds, temperature)? Please list them explicitly.
- Provide exact config files/scripts (including how s is computed from two PSDs) to ensure full reproducibility.
- Clearly mention that a new ESM will require retraining the model because of the preprocessing; this seems to be missing in the text.
I recommend major revision. The approach is promising and potentially impactful, but the current version requires broader validation, clearer methodological framing, and stronger baselines. I would be happy to review a revised manuscript if the authors choose to resubmit.
Citation: https://doi.org/10.5194/egusphere-2025-2646-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 1,052 | 50 | 23 | 1,125 | 20 | 17 | 27 |