the Creative Commons Attribution 4.0 License.
Applying deep learning to a chemistry-climate model for improved ozone prediction
Abstract. Chemistry-climate models have developed significantly over recent decades, yet they still exhibit substantial systematic biases in simulating atmospheric composition due to gaps in our understanding of the underlying processes. Building on deep learning's success in other domains, we explore its application to correcting surface ozone biases in the state-of-the-art chemistry-climate model UKESM1. Six statistical models were developed, of which the Transformer outperforms the others owing to its advanced architecture. A simple weighted ensemble approach is further shown to enhance performance by 14 % over the best single model (the Transformer), reducing RMSE to 0.69 ppb. Applied to future scenarios (SSP3-7.0 and SSP3-7.0-lowNTCF), UKESM1 shows a larger overestimation of ozone changes, by up to 25 ppb, compared to present-day conditions. Despite these biases, UKESM1 captures the non-linear sensitivity of ozone to its precursors, with temperature-sensitive processes identified as a dominant contributor to the biases. We highlight that simulations of future surface ozone are likely to become less accurate under a warmer climate. The bias correction approaches introduced here therefore have substantial potential to improve the accuracy of ozone impact assessments. These methods are also applicable to other chemistry-climate models, which is critical for informing air quality and climate policy decisions.
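The abstract does not specify how the weighted ensemble is constructed. A minimal sketch of one common weighting choice, inverse validation RMSE, is given below; the function names, the weighting rule, and the use of a reanalysis series as the reference are all assumptions for illustration, not the authors' documented method.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error between a prediction and a reference series."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def weighted_ensemble(preds, ref):
    """Combine several bias-correction models with inverse-RMSE weights.

    preds: dict mapping model name -> predicted surface O3 (ppb),
           each array the same shape as ref.
    ref:   reference series on a validation period (e.g. a reanalysis).
    Returns (ensemble prediction, normalised weights dict).
    """
    # Models with lower validation error receive larger weights.
    weights = {name: 1.0 / rmse(p, ref) for name, p in preds.items()}
    total = sum(weights.values())
    weights = {name: w / total for name, w in weights.items()}
    # Weighted average of the individual model predictions.
    ensemble = sum(w * preds[name] for name, w in weights.items())
    return ensemble, weights
```

With two models whose errors partly cancel, the weighted average can outperform either member, which is the qualitative behaviour the abstract reports for the ensemble versus the single Transformer.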
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-1250', Anonymous Referee #2, 21 Jul 2025
- RC2: 'Comment on egusphere-2025-1250', Anonymous Referee #1, 12 Aug 2025
Surface ozone (O3) as simulated in CMIP6 models such as UKCA/UKESM tends to show large biases against observations (e.g., Turnock et al., 2020). In particular, these models tend to simulate too much O3 in the Northern Hemisphere (NH) summer months (JJA) and too little in the NH winter months (DJF). The reasons behind these biases are not completely understood but likely reflect biases in the models' climate, chemistry and, obviously, emissions.
Liu et al. use a range of techniques -- from very simple statistical techniques to increasingly complex machine learning (ML) ones -- to bias correct UKCA modelled surface O3 during the historic period using "observations" from the Copernicus Atmosphere Monitoring Service (CAMS) reanalysis. CAMS itself is not perfect in this regard, as it is model data, but it does provide globally gridded data at a resolution comparable to the UKCA outputs.
They find that the Transformer (an attention-based ML model) performs best and multiple linear regression performs worst. They pool the results of their bias correction models using an ensembling technique and then use this ensemble bias correction model to bias correct future surface O3 projections. Using this, they find that the UKCA global mean surface O3 in JJA under SSP3-7.0 is overestimated by more than 12 ppb.
I think the results are interesting, but not surprising; the methodology is clear to follow, but not state-of-the-art; and overall this is a well written and thought-provoking study. I have my reservations about the reliability of the results, but I think this is a publishable study.
Main concerns:
1) Nudging -- as I understand it, the model simulations are not nudged, and as a result one source of the O3 disagreement in UKCA is the incorrect simulation of climate. Is this true? I think it would be worth making this clearer. Temperature pops out as an important feature, and presumably bias correcting for temperature still does nothing for the O3 bias (as shown before in Archibald et al., 2020).
2) Links to bias correction literature -- on my reading it seems like there is little to no attempt to discuss and place this work in the context of the very broad literature on bias correction, especially for climate models. I would like to know how techniques such as those by Ayar et al. (2021), Vrac and Friederichs (2015) or Nivron et al. (2025) complement, challenge or reinforce the choices of the approach of the task of bias correction taken here.
3) CAMS data -- I think the caveat of the use of these data needs to be made more clearly and centrally.
More general:
It is stated that you use 20 features in training your models but you don't articulate exactly what these are. Please can you add a table.
Figure 7 is interesting, and the central point about the shift in NOx/VOC emission regimes is clear, but could you map out the areas in space where these points come from, to provide some spatial context? I am struggling to see how there is only ever one O3 value for a given NOx/VOC ratio (and is this ratio of concentrations or emissions? If emissions, be clear whether it is g(N)/g(C) or what the units are).
References:
Vaittinada Ayar, P., Vrac, M., and Mailhot, A.: Ensemble bias correction of climate simulations: preserving internal variability, Sci. Rep., 11, 3098, https://doi.org/10.1038/s41598-021-82715-1, 2021.
Nivron et al.: A temporal stochastic bias correction using a machine learning attention model, https://doi.org/10.1017/eds.2024.42, 2025.
Vrac, M. and Friederichs, P.: Multivariate—Intervariable, Spatial, and Temporal—Bias Correction, J. Climate, 28, 218–237, https://doi.org/10.1175/JCLI-D-14-00059.1, 2015.
Citation: https://doi.org/10.5194/egusphere-2025-1250-RC2
Liu et al. (2025) use six different statistical models to bias correct UKESM1 surface ozone against the CAMS reanalysis. A weighted ensemble approach is shown to improve performance over any single model. This bias correction is then applied to future scenarios, specifically SSP3-7.0 and SSP3-7.0-lowNTCF. The manuscript is well written and the figures are broadly of good quality.
Major concerns
My major concern with this study is the assumptions behind it and the validity of the bias correction for future scenarios. The authors state "We assume that UKESM1 exhibits systematic biases that are associated with other self-generated variables" (L97), and UKESM1 biases are then corrected by comparison with CAMS. However, there is only a limited discussion of CAMS, and while CAMS shows reduced biases compared to TOAR observations, there are no detailed comments on whether CAMS uses the same emissions as UKESM1, or on how emissions, and how these are represented in the models, could influence biases against observations. How much is CAMS constrained by data assimilation compared to a free-running model such as UKESM1, and is correcting ozone for, e.g., biases in temperature a valid and fair approach? I feel that more reasoning is required here, along with clear caveats on the approach taken.
I am currently also unconvinced that bias-correcting to CAMS for the present day and then applying this bias correction to future scenarios is valid and fair. How sure are we that both UKESM and CAMS have the correct internal relationships in the present-day simulations to be confident that projecting this bias correction into the future, with a different climate state and different emissions, would leave it still valid? Again, I feel that a greater discussion of the validity of this method and its caveats should be made, especially as quite strong statements are made assuming that this is an entirely valid approach. The authors state that "This indicates that the UKESM1 has a greater sensitivity of seasonal O3 changes due to unknown reasons" (L149) in the discussion of future surface O3 changes, but these unknown reasons may make this approach less certain to succeed. How sure are the authors that they have the correct assumptions in the calculation of the present-day biases, and is UKESM the correct model with which to consider regional air pollution in the context of their wider questions?
Although, at the end of the paper, the authors do state that "we acknowledge that uncertainties remain, particularly regarding the use of CAMS data as a reference for model training" (L262-3), I would have preferred a much more detailed discussion of the assumptions and limitations of this study, as from the current manuscript I am not convinced that this is a valid approach. As a technical piece of work it is well formulated and presented, but scientifically I am nervous about the strength of the scientific statements that the authors make, particularly around the size of the biases UKESM may simulate when considering future climates.
Specific issues
Figure 3 - why do there seem to be discontinuities at 0.01, 0.1, and 1? The behaviour of the curve seems to jump at each of these values of sigma.
Figure 4 - I needed to zoom in quite a lot to be able to see the detail of the hatching. I would recommend making this plot bigger, perhaps a full-page 6x2 rather than the 3x4 currently presented.
Figure 6 - given that the error bars (only 1 standard deviation) in winter mostly all straddle 0, can it be said that there are any biases in that season at all from this method?