An Online Spectral Nudging-Based Correction System: Improving Physical Model Forecasts by Incorporating Large-Scale Circulations Derived from Machine Learning Models

Su, Yong; Wang, Jincheng; Shen, Xueshun; Liu, Couhua; Li, Xingliang; Jing, Hao; Zhang, Jin; Hu, Yingying

doi:10.5194/egusphere-2026-396

https://doi.org/10.5194/egusphere-2026-396

03 Mar 2026

Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Abstract. Traditional numerical weather prediction (NWP) models are constrained by limitations in the representation of physical processes and in computational resources, resulting in lengthy development cycles and relatively slow improvements in forecast skill. In recent years, machine learning (ML)-based weather forecasting models have advanced rapidly and, in some respects, outperform traditional physical models, particularly in forecasting large-scale circulation. However, these ML-based models suffer from notable deficiencies, such as over-smoothing in forecasts and inadequate capability for predicting extreme weather events. In this study, an online correction system based on the spectral nudging (SN) method is developed. In this system, the China Meteorological Administration Global Forecast System (CMA-GFS) is used as the foundational physical model, and a correction term is integrated into the governing equations, such that during numerical integration the large-scale circulation is constrained to evolve toward the forecasts produced by the ML model FuXi. The performance of the hybrid system on large-scale circulation prediction is comparable to that of the FuXi model, with a substantial extension of forecast lead time and a marked improvement in the stability of forecast skill. Verification against high-impact weather events, including heavy rainfall and tropical cyclones, demonstrates that the hybrid system integrates the strengths of the FuXi model in forecasting circulation patterns, precipitation distribution and tropical cyclone tracks, while preserving the advantages of the CMA-GFS in representing precipitation intensity, tropical cyclone intensity and fine-scale details. Thus, the system demonstrates robust forecasting capability for extreme weather. This proof-of-concept study verifies that the SN-based method can effectively integrate the complementary strengths of ML and physical models, providing a new pathway for operational NWP.
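
The correction term described in the abstract can be illustrated with a minimal sketch. It is not the CMA-GFS implementation: a 1-D periodic field and an FFT stand in for the global spectral transform, the model dynamics are omitted, and the names `low_pass`, `spectral_nudge`, `k_max`, `dt` and `tau` are hypothetical. The idea shown is only that wavenumbers up to a truncation limit are relaxed toward the ML guidance while smaller scales are left untouched.

```python
import numpy as np

def low_pass(field, k_max):
    """Keep only Fourier wavenumbers 0..k_max of a 1-D periodic field."""
    coeffs = np.fft.rfft(field)
    coeffs[k_max + 1:] = 0.0
    return np.fft.irfft(coeffs, n=field.size)

def spectral_nudge(model, target, k_max, dt, tau):
    """One explicit (forward Euler) step of the nudging term alone.

    Only the large-scale part of (target - model) is added, scaled by dt / tau,
    so wavenumbers above k_max are not modified.
    """
    return model + (dt / tau) * low_pass(target - model, k_max)

n = 128
x = 2 * np.pi * np.arange(n) / n
model = np.sin(x) + 0.3 * np.sin(20 * x)   # large-scale wave plus small-scale detail
target = 1.5 * np.sin(x + 0.4)             # smooth guidance, large scales only

nudged = model.copy()
for _ in range(200):                       # assumed values: 600 s step, 6 h relaxation time
    nudged = spectral_nudge(nudged, target, k_max=4, dt=600.0, tau=6 * 3600.0)
```

After the loop, the wavenumber-1 component of `nudged` has relaxed almost entirely to that of `target`, while the wavenumber-20 detail of `model` is unchanged. In a real model the same term would be added alongside the physical tendencies at every time step.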

Received: 23 Jan 2026 – Discussion started: 03 Mar 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 28 Apr 2026)

RC1: 'Comment on egusphere-2026-396', Anonymous Referee #1, 29 Mar 2026
The paper describes research tests of a system to nudge the CMA-GFS physical model with the FuXi machine learning model to improve the large-scale evolution of the physical model whilst retaining its benefits for small scale detail.
Fundamentally, the methodology is very similar to the referenced Husain et al. (2025) paper, but using different physical and ML models. It shares the same limitations in terms of the coarse vertical resolution of the output from the ML model and inconsistent analyses between the physical and ML models. As the authors note, these limitations were addressed by Polichtchouk et al. (2024) who used an ML model with much higher vertical resolution to gain considerably improved results from nudging, especially in the lower troposphere. The authors of the present paper do outline their plans to address these limitations in their discussion section.
Hence, this paper doesn’t necessarily advance the science; however, it does document a repeated test of a published method (an important aspect of science), and similar results are obtained using different models, which is useful for other centres considering the nudging approach. The paper is clear and well written and, therefore, I believe it is a useful addition to the literature and should be published in EGUsphere.
Minor comments:
Line 18 – suggest replacing ‘foundational’ with ‘underpinning’ to avoid any confusion with foundational AI models.

The authors cite a weakness of ML models as being “Progressive smoothing in long-range forecasts”. Whilst this can be the case if the target is RMSE, for which a smoother field can lead to a better score to avoid a ‘double penalty’ from positional errors, ML weather models do now exist which avoid this smoothing by not (solely) minimising on RMSE, such as the AIFS-CRPS. This should be acknowledged in the paper.

Line 149 and subsequent use of ‘typhoon’. Where the reference is not specifically to the Pacific basin, the more generic term of ‘tropical cyclone’ should be used.

Line 296 – could the issues with nudging at smaller scales than T21 also be due to the poor vertical resolution of FuXi output and lack of nudging in the lower troposphere? Where centres have tried nudging to the model level AIFS, improved performance has been found to scales of T63 without the issues documented here.

Figure 4 – suggest making it clear that the “gridded merged precipitation product of the CMA” is observationally based, and stating which sources are merged (gauge, satellite, radar?).

Figures 8 & 9 – what is the bias measured with respect to? Is it each model’s own analysis, ERA5 for all models, or something else?

I assume only deterministic models are used here. Throughout the paper, the framing is in terms of the number of days of skilful prediction. The ACC threshold of 0.6 is widely used, but fairly arbitrary. Ensembles provide the most useful forecast information, even when the skill of the deterministic model is relatively high. Do the authors have plans to incorporate the spectral nudging into CMA’s ensemble prediction system? Understanding what barriers would need to be overcome to achieve this would be a valuable addition to the discussion section.
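
The "days of skilful prediction" framing in the last comment can be sketched as follows. This is an assumption-laden illustration rather than the verification code used in the manuscript: the ACC here is unweighted (operational verification normally applies latitude weighting), and the function names and sample values are hypothetical.

```python
import numpy as np

def acc(forecast, verification, climatology):
    """Centred anomaly correlation coefficient between two fields (no area weighting)."""
    f = forecast - climatology
    v = verification - climatology
    f = f - f.mean()
    v = v - v.mean()
    return float(np.sum(f * v) / np.sqrt(np.sum(f**2) * np.sum(v**2)))

def skilful_days(acc_by_day, threshold=0.6):
    """Number of leading forecast days whose ACC stays at or above the threshold.

    acc_by_day[0] is the ACC at day 1, acc_by_day[1] at day 2, and so on.
    """
    below = np.flatnonzero(np.asarray(acc_by_day) < threshold)
    return int(below[0]) if below.size else len(acc_by_day)

# Made-up ACC values for days 1..8, for illustration only.
daily_acc = [0.98, 0.95, 0.90, 0.82, 0.71, 0.62, 0.55, 0.48]
print(skilful_days(daily_acc))  # 6
```

Because the result depends directly on `threshold`, changing 0.6 to another value shifts the reported number of skilful days, which is the arbitrariness the comment points out.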

Citation: https://doi.org/10.5194/egusphere-2026-396-RC1
CC1: 'Comment on egusphere-2026-396', Yi Yang, 30 Mar 2026
This study holds significant scientific importance and application value. By using large-scale information extracted from a machine learning model to optimize the physics-driven model, it significantly improves that model’s accuracy and generalization capability. The writing is of high quality, and the structure is well organized and easy to follow. However, I believe the study should emphasize its key strengths more clearly.
Major comments
Introduction: The authors present a relatively comprehensive discussion of physics-driven models and machine learning models. However, two points warrant attention:
The authors note that the development of physics-driven models is primarily constrained by limitations in the representation of physical processes. While this statement is technically accurate, it is worth noting that the key advantage of physics-driven models over machine learning models lies in their physical interpretability. Therefore, in the context of this study, this point may be reconsidered or omitted.

As noted by the authors, several studies have already improved global forecasts by extracting large-scale circulations from machine learning models (Lines 141-152). However, in the following paragraph, the authors directly introduce the online correction system based on the nudging method, which I find somewhat confusing. In my view, it would be beneficial for the authors to first summarize the current state of research and clearly articulate the existing problems—namely, the motivation for conducting this study. This would help better highlight the importance of the research.

The authors state that the FuXi model is driven by ERA5 reanalysis data to produce forecast fields, which then supply large-scale circulations to the physical model. I have a concern: given that reanalysis data are generally not accessible in real time for operational applications, how feasible is this method in practice?
Minor comments
Line 148: Why is “truncation wavenumber” particularly noted here, unless it has special significance?
Lines 262-278: the truncation wavenumber is determined mainly based on the KES differences between the CMA-GFS and FuXi models, illustrated using forecasts from four initialization dates. Given that this selection (42 instead of 21) is derived from a limited set of cases, I am concerned about its representativeness and robustness.
Line 353: For the case study, are there any quantitative comparative results available?

Citation: https://doi.org/10.5194/egusphere-2026-396-CC1
RC2: 'Comment on egusphere-2026-396', Anonymous Referee #2, 08 Apr 2026

This paper proposes an online correction framework that integrates a machine learning model (FuXi) with a physical numerical weather prediction model (CMA-GFS) through spectral nudging.
Overall, I find this work valuable as a careful and useful validation of an existing methodology. In particular, it demonstrates that the hybrid nudging framework can be successfully implemented within a different model system, which may be helpful for operational centers considering similar approaches. The manuscript is also clearly written and provides a detailed description of the workflow, which makes it easy to follow.
From a scientific perspective, however, the main contribution appears to be a system-specific implementation of an already established paradigm rather than a fundamentally new methodological development. The approach is largely consistent with prior work such as Husain et al. (2025), including scale-selective spectral nudging, handling of coarse vertical ML outputs, and the use of vertical weighting to mitigate inconsistencies. While this is not a limitation in itself, it may be helpful for the authors to more clearly position their work relative to these studies and clarify whether there are specific aspects in which their implementation provides advantages.
One aspect that could benefit from further clarification is the preprocessing step used to map FuXi outputs from 13 pressure levels to the 87 model levels of CMA-GFS. This step is only briefly mentioned and not specified in detail. Since vertical interpolation is a critical component that can significantly influence the representation of atmospheric structure (especially gradients, stability, and boundary-layer processes), the lack of description raises concerns about reproducibility and scientific validity. It is unclear what interpolation scheme is used, how physical consistency is preserved, and to what extent this preprocessing step may introduce biases or damp important features before nudging is even applied.
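
To show the kind of specification this comment asks for, one common choice is linear interpolation in the logarithm of pressure. The sketch below is only that: the manuscript does not state which scheme is used, the 13 pressure levels listed are an assumed typical set for ML model output, and the 87 target pressures are a hypothetical fixed column (in a real model they vary in space and time).

```python
import numpy as np

# Assumed source levels in hPa (not taken from the manuscript).
P_ML = np.array([50, 100, 150, 200, 250, 300, 400, 500,
                 600, 700, 850, 925, 1000], dtype=float)

def interp_log_pressure(profile, p_src, p_dst):
    """Linearly interpolate one column in ln(p).

    p_src must be increasing. Targets outside the source range take the
    nearest end value, because np.interp does not extrapolate.
    """
    return np.interp(np.log(p_dst), np.log(p_src), profile)

# Hypothetical column of 87 target pressures between 60 and 950 hPa.
p_model = np.geomspace(60.0, 950.0, 87)

# Synthetic profile that is linear in ln(p), so the interpolation is exact here.
t_ml = 200.0 + 30.0 * np.log(P_ML / 50.0)
t_model = interp_log_pressure(t_ml, P_ML, p_model)
```

A scheme like this cannot create structure finer than the 13 source levels resolve, which is why gradients, stability and boundary-layer features are the quantities most at risk.
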
In the current work, the FuXi outputs are interpolated to the CMA-GFS vertical grid through this unspecified preprocessing step, and the resulting inconsistency is mitigated by applying a vertically varying nudging coefficient that limits the correction primarily to the mid–upper troposphere. However, this approach closely follows that of Husain et al. (2025), who employed a similar vertical weighting strategy to address the same issue. As such, it is unclear what methodological innovation is introduced here beyond adopting an existing workaround.
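
A vertically varying nudging coefficient of the kind discussed here can be sketched as a weight profile in pressure. The shape and the thresholds `p_bottom` and `p_full` below are hypothetical and are not taken from the manuscript or from Husain et al. (2025); the sketch only shows how such a weight switches the correction off near the surface and on aloft.

```python
import numpy as np

def nudging_weight(p_hpa, p_bottom=850.0, p_full=500.0):
    """Weight in [0, 1]: 0 at pressures >= p_bottom, 1 at pressures <= p_full.

    Between the two thresholds the weight follows a smooth sin^2 ramp.
    """
    p = np.asarray(p_hpa, dtype=float)
    s = np.clip((p_bottom - p) / (p_bottom - p_full), 0.0, 1.0)
    return np.sin(0.5 * np.pi * s) ** 2

levels = np.array([1000.0, 850.0, 675.0, 500.0, 200.0])  # hPa
tau = 6 * 3600.0                                          # assumed relaxation time in seconds
rate = nudging_weight(levels) / tau                       # effective relaxation rate per level
```

With these assumed thresholds the lowest levels receive no correction, so any inconsistency introduced by interpolating coarse ML output is kept out of the boundary layer at the cost of leaving it unconstrained.
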
At the same time, the manuscript acknowledges alternative approaches, such as Polichtchouk et al. (2024), who address this limitation more fundamentally by increasing the vertical resolution of the ML model (e.g., 137 levels), thereby reducing the need for ad hoc vertical weighting. Given this, it would be important for the authors to clarify why a similar strategy is not adopted in the present study. Is the choice driven by computational constraints, data availability, or compatibility with FuXi? Without such justification, the current approach appears as a pragmatic but potentially suboptimal solution rather than a deliberate methodological design.
Regarding presentation, the introduction provides a broad survey of ML-based weather models, but the connection to the core contribution is not clearly articulated. The cited models appear more as a catalogue than as elements that directly motivate the proposed method. Given that the main idea can be summarized concisely — combining ML-derived large-scale circulation with physics-based small-scale consistency via spectral nudging — the introduction could be significantly streamlined. More generally, this issue extends beyond the introduction to the entire manuscript. While the detailed narrative of the research process is informative, the paper would benefit from a more concise and structured presentation. In particular, lengthy descriptive explanations of intermediate attempts or design choices could be reduced, and key ideas could instead be conveyed more effectively through tables, figures, or mathematical formulations.
In addition, the paper does not sufficiently justify why spectral nudging is appropriate in this global modeling context. The evaluation is also limited to internal comparisons within a single modeling framework, without benchmarking against other state-of-the-art ML or hybrid systems, including closely related work such as Husain et al. (2025). This makes it difficult to assess the broader competitiveness or generality of the approach.

Citation: https://doi.org/10.5194/egusphere-2026-396-RC2

Viewed

Total article views: 280 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
192	73	15	280	10	13

Views and downloads (calculated since 03 Mar 2026)

Month	HTML	PDF	XML	Total
Mar 2026	177	58	14	249
Apr 2026	15	15	1	31

Latest update: 13 Apr 2026

Short summary

Traditional weather prediction models improve slowly, while machine learning models struggle with extreme weather and fine details. To address these gaps, we developed an online correction system that leverages a machine learning model's skillful large-scale circulation to guide a physical model. This hybrid model enhances large-scale skill while preserving small-scale features, providing a viable pathway for improving operational weather forecasting.

