Improvement of the Computational Efficiency in SVD-3DEnVar Data Assimilation Scheme and Its Preliminary Application to the TRAMS 3.0 Model

Liu, Kun; Xu, Daosheng; Zheng, Fei; He, Juanxiong; Li, Chun; Leung, Jeremy Cheuk-Hin; Zhang, Mingyang; Zhao, Dingchi; He, Quanjun; Zhang, Yuewei; Li, Yi; Zhang, Banglin

doi:10.5194/egusphere-2025-4632

Preprints

https://doi.org/10.5194/egusphere-2025-4632

14 Nov 2025

Abstract. Although the Singular Value Decomposition-three Dimensional Ensemble Variational (SVD-3DEnVar) data assimilation scheme has achieved successful application in real case simulations with comprehensive numerical weather prediction models, its computational efficiency still cannot meet the demands of actual operational numerical forecasting. The main limitations lie in the generation of three-dimensional perturbations and the implementation of parallel calculations. This paper constructed a three-dimensional perturbation field generation scheme that supports multi-process parallelism and can directly generate any specified number of grid points in both horizontal and vertical directions. At the same time, an efficient parallel implementation scheme has been developed according to the characteristics of local patch assimilation in the SVD-3DEnVar scheme. The Observing System Simulation Experiment (OSSE) test results based on the Tropical Regional Atmospheric Model System (TRAMS) show that after computational efficiency optimization, the time required to generate a 3D perturbation field has been reduced from 22 minutes to 2.2 seconds, while the runtime of the assimilation process has decreased from 1,700 minutes under serial execution to less than 15 minutes (using 150 nodes in parallel). Finally, we conducted an assimilation experiment using actual observational data of sea surface wind fields to preliminarily validate the reasonableness of the assimilation results from the optimized SVD-3DEnVar scheme.

Received: 20 Sep 2025 – Discussion started: 14 Nov 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on egusphere-2025-4632', Anonymous Referee #1, 17 Nov 2025

The SVD-3/4DVar method, proposed by Qiu et al. (2007), is considered a pioneering achievement in the field of four-dimensional ensemble variational data assimilation (4DEnVar). This manuscript provides a valuable exploration of the practical application of the SVD-3DVar method and demonstrates certain innovative merits, meeting the publication criteria of this journal. The following suggestions are provided for further improvement:

1. Regarding the choice of methodology, it is recommended that you explain why SVD-4DVar was not adopted in favour of SVD-3DVar, while briefly analysing the core challenges of the latter.

2. It should be noted that the singular value decomposition (SVD) of matrix A is extremely challenging in practice due to its large dimensions (Nx+Ny), posing significant difficulties in terms of both storage and computation. A discussion on this aspect is recommended.

3. While equation (11) provides the Gaussian weight function, the localization scheme used in SVD-3DVar should be presented in more detail to enhance the completeness of the paper.

4. In recent years, 4DEnVar methods have advanced rapidly. To reflect an up-to-date understanding of the field, it is advisable to include references to relevant studies published between 2022 and 2025.

5. Regarding the generation of initial samples, several classical works (e.g. those by Evensen) have achieved high memory efficiency. It would be beneficial to reference these works and discuss their relevance to the present method. From a practical perspective, the main computational burden in parallelisation typically lies in the ensemble forecast component, which should also be addressed.

6. As this is an ensemble-based method, it is recommended that the ensemble sample update strategy in SVD-3DVar is explained briefly to improve the completeness of the methodological description.

Citation: https://doi.org/10.5194/egusphere-2025-4632-RC1
- AC1: 'Reply on RC1', Kun Liu, 22 Nov 2025
  
  Response to Reviewer 1
  
  We sincerely thank the reviewer for the insightful comments and constructive suggestions, which have significantly helped us improve the quality of our manuscript. We have carefully considered all points and have revised the manuscript accordingly. Below, we provide a point-by-point response to the comments.
  
  Comment 1:
  Regarding the choice of methodology, it is recommended that you explain why SVD-4DVar was not adopted in favor of SVD-3DVar, while briefly analyzing the core challenges of the latter.
  Response:
  We appreciate the reviewer’s suggestion. The decision to adopt SVD-3DEnVar instead of SVD-4DEnVar was primarily motivated by two factors:
  (1) Computational efficiency and operational feasibility: The main objective of this study is to improve computational efficiency for operational typhoon forecasting, so the relatively simple 3DVar scheme is chosen for experimentation...
  (2) Observation frequency: The sea surface wind observations used in our real-data experiments are available only every 6 hours. At this low temporal resolution, the temporal-continuity advantages of 4DEnVar cannot be fully exploited.
  
  We acknowledge that SVD-3DEnVar has its own challenges, which we now discuss in the revised manuscript (Section 5). The primary limitations are (i) the use of a single localization scale, which may not optimally handle multi-scale observations (e.g., surface, satellite, and radar), and (ii) the limited ability of linear combinations of singular vectors to represent strongly nonlinear relationships, especially for unobserved variables. These challenges will guide our future work on multi-scale assimilation and machine-learning-enhanced methods.
  
  Comment 2:
  It should be noted that the singular value decomposition (SVD) of matrix A is extremely challenging in practice due to its large dimensions (N_x + N_y), posing significant difficulties in terms of both storage and computation. A discussion on this aspect is recommended.
  Response:
  
  We fully agree with the reviewer. To address the computational and storage challenges of performing SVD on the large matrix A, we implemented a local patch assimilation strategy. This approach significantly reduces the effective dimensions of A by horizontal and vertical localization. Only observations within a specified horizontal (l_h) and vertical (l_v) radius from the central grid point are included. As a result, both N_x (model variables in the local patch) and N_y (observations within the local patch) are drastically reduced. This makes the SVD computationally feasible without sacrificing the flow-dependent covariance information. We have added a clarification in Section 3.1 to explain this strategy.
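The patch selection described in this response can be illustrated with a short sketch. It is not the authors' code: the function name `select_patch_obs`, the tuple layout of the positions, and the use of grid-index units for distances are assumptions made here for illustration.

```python
def select_patch_obs(center, obs_positions, l_h, l_v):
    """Return indices of observations inside the local patch around `center`.

    center        : (i, j, k) grid indices of the central grid point
    obs_positions : list of (i, j, k) positions in (possibly fractional) grid units
    l_h, l_v      : horizontal and vertical patch radii in grid units (assumed units)
    """
    ci, cj, ck = center
    selected = []
    for idx, (oi, oj, ok) in enumerate(obs_positions):
        r_h = ((oi - ci) ** 2 + (oj - cj) ** 2) ** 0.5  # horizontal distance
        r_v = abs(ok - ck)                               # vertical distance
        if r_h <= l_h and r_v <= l_v:
            selected.append(idx)
    return selected
```

Only the selected observations (N_y of them) and the model variables of the patch (N_x) would enter the local matrix A, which is what keeps the SVD small.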
  
  Comment 3:
  While equation (11) provides the Gaussian weight function, the localization scheme used in SVD-3DVar should be presented in more detail to enhance the completeness of the paper.
  Response:
  Thank you for this suggestion. We have expanded the description of the localization scheme in Section 3.1. The revised text now reads:
  
  “The Gaussian weight function defined in Equation (11) is applied to each observation within the local patch. The horizontal and vertical localization scales (σ_h and σ_v) control the rate at which observation influence decays with distance. This localization ensures that only observations within a specified radius significantly impact the analysis increment at the center of the local patch, thereby mitigating spurious long-range correlations and improving the stability and accuracy of the assimilation.”
  
  Comment 4:
  In recent years, 4DEnVar methods have advanced rapidly. To reflect an up-to-date understanding of the field, it is advisable to include references to relevant studies published between 2022 and 2025.
  Response:
  We thank the reviewer for this suggestion. We have now incorporated several recent references on 4DEnVar advancements in the Introduction and Conclusion sections, including:
  Inverarity et al. (2023) on hybrid En-4DEnVar in the Met Office system;
  Berre and Arbogast (2024) on hybrid covariances at Météo-France;
  Lu and Wang (2024) on scale-dependent localization in hurricane forecasting;
  Thiruvengadam and Wang (2025) on convective-scale 4DEnVar;
  Wang et al. (2025) on CubeSat radiance assimilation.
  
  These additions help contextualize our work within the evolving landscape of ensemble-variational methods.
  
  Comment 5:
  Regarding the generation of initial samples, several classical works (e.g. those by Evensen) have achieved high memory efficiency. It would be beneficial to reference these works and discuss their relevance to the present method. From a practical perspective, the main computational burden in parallelization typically lies in the ensemble forecast component, which should also be addressed.
  Response:
  We agree with the reviewer. The initial perturbation generation in SVD-3DEnVar is based on the Gaussian random field method introduced by Evensen (1994), which is known for its memory efficiency and statistical robustness. However, the original implementation was limited to 2D square grids with odd numbers of points, which motivated our multi-dimensional and parallel optimizations. We now cite Evensen’s foundational work and clarify how our optimizations build upon it.
  
  Regarding the computational burden of ensemble forecasts: yes, the ensemble integration is indeed the most computationally intensive part in parallel implementations. However, since ensemble members are independent, they can be run concurrently if sufficient computational resources are available. While this study focuses on optimizing the perturbation generation and assimilation steps, we acknowledge the resource demands of ensemble forecasting and will address this in future work.
  
  Comment 6:
  As this is an ensemble-based method, it is recommended that the ensemble sample update strategy in SVD-3DVar is explained briefly to improve the completeness of the methodological description.
  Response:
  We have added the following explanation to Section 1:
  “Unlike traditional ensemble assimilation schemes (such as EnKF), which directly update each ensemble member during cyclic assimilation, SVD-3DEnVar only assimilates and updates the control forecast. Therefore, after each assimilation cycle, the updated analysis field must be perturbed again to generate the initial conditions for the next cycle's ensemble forecast. This approach has the advantage of avoiding filter divergence and, in the presence of model errors, performs better than EnKF (Qiu et al., 2007).”
  
  Citation: https://doi.org/10.5194/egusphere-2025-4632-AC1
CC1:
'Comment on egusphere-2025-4632', Nima Zafarmomen, 03 Dec 2025
The paper targets computational bottlenecks in the SVD-3DEnVar data assimilation scheme and proposes two main engineering improvements: (i) a redesigned perturbation-field generation algorithm that directly produces 3D, grid-conforming perturbations with multi-process parallelism and without intermediate I/O, and (ii) a parallel local-patch assimilation framework that partitions work, balances load, and reduces memory duplication via node-level pointer sharing. Using the TRAMS 3.0 model, the authors report large wall-clock reductions: 3D perturbation generation from ~22 minutes to 2.2 seconds and end-to-end assimilation from ~1700 minutes (serial) to <15 minutes with 150 nodes, alongside preliminary OSSE and real-data tests that show reasonable analysis increments and some improvement in typhoon forecasts.
Overall, the paper tackles a very practical problem: making SVD-3DEnVar fast and memory-efficient enough for realistic, large-domain applications. I recommend it for publication after the following comments are addressed:
How do you prescribe and verify the target covariance statistics of the perturbation fields (variance, horizontal and vertical correlation lengths, cross-variable correlations)?

Are perturbations generated multivariately (i.e., with controlled cross-variable balance), or independently per variable with post hoc smoothing?

What are the precise observation operator and error model used for the 10 m sea-surface winds (e.g., stability-dependent surface-layer mapping, bias correction, representativeness error)? How are coastal/land points handled, and are rainy scenes screened?

Please clarify the SVD/variational notation: define Λp explicitly, correct the SVD equation symbols, and detail the truncation criterion for K (energy, cross-validated error, or fixed)?

I strongly recommend expanding your introduction and citing the following paper:
Assimilation of sentinel‐based leaf area index for modeling surface‐ground water interactions in irrigation districts

Abstract:

Consider explicitly stating that the main novelty is the computational optimization and parallelization that makes SVD-3DEnVar suitable for operational use, plus a first real-data test with satellite-derived sea surface winds.

Notation consistency:

Ensure all symbols in the equations are defined once and consistently (e.g., Λ vs Λ_P in Equation (10); clarify if Λ_P is the same eigenvalue matrix as in Eq. (4) or modified).

Make sure the dimension of vectors and matrices in Eqs. (1)–(10) is always clear (model vs observation subspaces).

Equation (11) Gaussian localization:

Check the condition in the piecewise definition:

You currently use (r_h ≤ l_h and r_v ≤ l_v) vs (r_h > l_h or r_v ≥ l_v). It might be clearer and more symmetric to use strict/≤ consistently.

Consider giving an example of actual physical localization scales (in km) corresponding to l_h, l_v, σ_h, σ_v.
Citation: https://doi.org/10.5194/egusphere-2025-4632-CC1
- AC2: 'Reply on CC1', Kun Liu, 14 Dec 2025
  
  Response to Reviewer 2 (Dr. Nima Zafarmomen)
  We sincerely thank the reviewer for the thorough and constructive comments, which have significantly helped us improve the manuscript. Below, we provide point-by-point responses and describe the corresponding revisions made.
  Comment 1:
  
  How do you prescribe and verify the target covariance statistics of the perturbation fields (variance, horizontal and vertical correlation lengths, cross-variable correlations)?
  Response:
  
  The perturbation fields are generated following Evensen (2003). First, a set of smooth two-dimensional random perturbation fields with zero mean and unit variance is generated. Then, a fixed weighted blending scheme is applied (40% from the perturbation information of adjacent filled layers, 60% from newly generated two-dimensional random fields) to produce three-dimensional perturbation fields with a certain vertical correlation scale. After superimposing different three-dimensional perturbations for each variable, the fields are integrated forward for a period of time to allow the perturbations between different variables to develop reasonable multivariate correlations (e.g., physically consistent relationships between temperature and pressure).
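The layer-by-layer blending described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `stack_layers`, the use of NumPy, and filling the column in a single direction starting from the first 2D field are all hypothetical choices.

```python
import numpy as np

def stack_layers(fields_2d, w_prev=0.4, w_new=0.6):
    """Build a 3D perturbation from independent 2D fields, one layer at a time.

    Each new layer is w_prev * (previously filled layer) + w_new * (fresh 2D field),
    which introduces a vertical correlation between neighbouring layers.
    """
    layers = [np.asarray(fields_2d[0], dtype=float)]
    for new in fields_2d[1:]:
        layers.append(w_prev * layers[-1] + w_new * np.asarray(new, dtype=float))
    return np.stack(layers, axis=0)  # shape: (n_levels, ny, nx)
```

Note that with independent unit-variance inputs the second layer has variance 0.4² + 0.6² = 0.52, so a scheme that needs unit variance at every level would have to rescale each layer; the response does not say whether this is done.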
  The control of variance and horizontal correlation scale for the two-dimensional random perturbation fields is based on the following principles:
  For a continuous two-dimensional field $q = q(x, y)$, its Fourier transform can be written as:
  
  $$ q(x, y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \hat{q}(\mathbf{k})\, e^{i \mathbf{k} \cdot \mathbf{x}}\, d\mathbf{k} \tag{R1} $$
  where $\hat{q}(\mathbf{k})$ are the Fourier coefficients, the wavenumber vector $\mathbf{k}$ is defined as $\mathbf{k} = (\kappa_l, \gamma_p)$, and $\kappa_l$ and $\gamma_p$ are the wavenumbers in the x and y directions, respectively.
  Discretizing on an N × M horizontal grid:
  
  $$ q(x_n, y_m) = \sum_{l,p} \hat{q}(\kappa_l, \gamma_p)\, e^{i(\kappa_l x_n + \gamma_p y_m)}\, \Delta k \tag{R2} $$
  where $x_n = n\Delta x$, $y_m = m\Delta y$, $\kappa_l = \frac{2\pi l}{x_N} = \frac{2\pi l}{N\Delta x}$, $\gamma_p = \frac{2\pi p}{y_M} = \frac{2\pi p}{M\Delta y}$, and $\Delta k = \Delta\kappa\, \Delta\gamma = \frac{(2\pi)^2}{N M \Delta x \Delta y}$.
  Assuming the Fourier coefficients take the form:
  
  $$ \hat{q}(\kappa_l, \gamma_p) = \frac{c}{\sqrt{\Delta k}}\, e^{-(\kappa_l^2 + \gamma_p^2)/\sigma^2}\, e^{2\pi i \phi_{l,p}} \tag{R3} $$
  where $c$ is a normalization constant controlling the variance, $\sigma$ is a bandwidth parameter determining the correlation length, and $\phi_{l,p} \in [0, 1]$ is a uniformly distributed random number.
  Substituting (R3) into (R2) yields the spatial field expression:
  
  $$ q(x_n, y_m) = \sum_{l,p} \frac{c}{\sqrt{\Delta k}}\, e^{-(\kappa_l^2 + \gamma_p^2)/\sigma^2}\, e^{2\pi i \phi_{l,p}}\, e^{i(\kappa_l x_n + \gamma_p y_m)}\, \Delta k \tag{R4} $$
  This can be interpreted as multiplying a Gaussian-shaped filter (spectral window) and a random phase in wavenumber space, followed by an inverse Fourier transform to obtain the spatial random field. Therefore, the horizontal correlation length and variance of the two-dimensional random perturbation field can be controlled by adjusting the parameters $\sigma$ and $c$.
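The recipe in Eqs. (R2) to (R4) can be sketched with a fast Fourier transform. The following is a minimal illustration, not the authors' implementation. The function name `random_field_2d`, the use of NumPy, taking the real part of the inverse transform (rather than imposing conjugate symmetry on the coefficients), and the final rescaling to zero mean and unit variance (standing in for the constant c) are all assumptions made here.

```python
import numpy as np

def random_field_2d(n, m, dx, dy, sigma, seed=None):
    """Smooth pseudo-random 2D field with zero mean and unit variance.

    n, m   : number of grid points in x and y
    dx, dy : grid spacings
    sigma  : spectral bandwidth; a smaller sigma gives a longer correlation length
    """
    rng = np.random.default_rng(seed)
    kappa = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)    # x wavenumbers 2*pi*l/(N*dx)
    gamma = 2.0 * np.pi * np.fft.fftfreq(m, d=dy)    # y wavenumbers 2*pi*p/(M*dy)
    k2 = kappa[:, None] ** 2 + gamma[None, :] ** 2
    phase = rng.uniform(0.0, 1.0, size=(n, m))       # random phases phi_{l,p}
    # Gaussian spectral window times a random phase, as in Eq. (R3) up to a constant.
    spectrum = np.exp(-k2 / sigma**2) * np.exp(2j * np.pi * phase)
    field = np.real(np.fft.ifft2(spectrum))          # back to grid space
    field -= field.mean()                            # enforce zero mean
    field /= field.std()                             # enforce unit variance
    return field
```

For example, `random_field_2d(64, 48, 9.0, 9.0, sigma=0.1, seed=1)` would return a 64 × 48 array on a hypothetical 9 km grid; normalizing after the transform avoids having to derive c analytically.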
  Reference:
  
  Evensen, G. (2003), The Ensemble Kalman Filter: Theoretical Formulation and Practical Implementation. Ocean Dynamics, 53: 343–367.
  Comment 2:
  
  Are perturbations generated multivariately (i.e., with controlled cross-variable balance), or independently per variable with post hoc smoothing?
  Response:
  
  Currently, perturbations for each variable are generated independently. They are added to the model initial conditions and integrated forward for a period of time, during which cross-variable covariance relationships develop among the perturbations.
  Comment 3:
  
  What are the precise observation operator and error model used for the 10 m sea-surface winds (e.g., stability-dependent surface-layer mapping, bias correction, representativeness error)? How are coastal/land points handled, and are rainy scenes screened?
  Response:
  
  (a) Observation operator: The 10 m wind components ($u_{10}$, $v_{10}$) are diagnosed from the wind components ($u_x$, $v_x$) at the model's first layer height $z_x$, and then horizontally interpolated to observation locations. The diagnosis considers atmospheric stability through the dimensionless wind shear function $\Phi_x$ (the stability function in Monin–Obukhov similarity theory). The formulas are:
  
  $$
  \begin{cases}
  u_{10} = u_x \dfrac{\Phi_{10}}{\Phi_x} \\[4pt]
  v_{10} = v_x \dfrac{\Phi_{10}}{\Phi_x}
  \end{cases} \tag{R5}
  $$
  where $\Phi_{10}$ and $\Phi_x$ are the dimensionless wind gradient functions at 10 m height and the model's first layer height $z_x$, respectively.
  Under different stability conditions, $\Phi_x$ is expressed as:
  
  $$
  \Phi_x =
  \begin{cases}
  -10 \ln\left(\dfrac{z_x}{z_0}\right), & Ri_b > Ri_c = 0.2 \\[4pt]
  -5\, \dfrac{Ri_b}{1.1 - 5 Ri_b}\, \ln\left(\dfrac{z_x}{z_0}\right), & Ri_c \ge Ri_b > 0 \\[4pt]
  0, & Ri_b = 0 \\[4pt]
  2 \ln\left(\dfrac{1+x}{x}\right) + 2 \ln\left(\dfrac{1+x^2}{x}\right) - 2 \tan^{-1}(x) + \dfrac{\pi}{2}, & Ri_b < 0
  \end{cases} \tag{R6}
  $$
  where
  
  $$ x = \left(1 - \frac{16 z}{L}\right)^{1/4} \tag{R7} $$
  Here, $Ri_b$ is the bulk Richardson number, $Ri_c$ is a threshold for strongly stable conditions, and $z_0$ is the surface roughness; in Eq. (R7), $z$ is the height above the surface and $L$ is the Obukhov length.
  (b) Bias correction: The satellite-derived sea surface wind data have been bias-corrected prior to assimilation.
  (c) Representativeness error: In the SVD-3DEnVar scheme, only the leading N singular vectors are retained when fitting observation increments, which effectively truncates short-wave information in observations. Therefore, the observation representativeness error has minimal impact on the assimilation results.
  (d) Coastal/land treatment: The model uses static surface data to distinguish between ocean and land grid points. Over land, roughness length z_0 is specified based on land cover type. Over ocean, z_0 is diagnosed from sea surface wind speed using the Charnock (1955) relation:
  
  $$ z_0 = z_{ch}\, \frac{u_*^2}{g} + 0.00001 \tag{R8} $$
  
  where z_ch is the Charnock coefficient, u_* is the friction velocity, and g is gravitational acceleration. Differences in roughness calculation between land and sea surfaces further influence the diagnosis of 10 m winds through Eq. (R6).
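As a small worked example of Eq. (R8), the relation can be written as a one-line function. The function name `charnock_roughness` and the default values `z_ch = 0.0185` and `g = 9.81` are assumptions chosen for illustration; the response does not state the Charnock coefficient used in TRAMS.

```python
def charnock_roughness(u_star, z_ch=0.0185, g=9.81):
    """Sea-surface roughness length (m) from friction velocity u_star (m/s), Eq. (R8).

    z_ch : Charnock coefficient (assumed illustrative value, not from the source)
    g    : gravitational acceleration in m/s^2
    """
    return z_ch * u_star**2 / g + 1.0e-5
```

The additive 0.00001 m term keeps z_0 strictly positive in calm conditions, so the logarithms in Eq. (R6) stay finite.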
  Reference:
  
  Charnock, H. (1955), Wind stress on a water surface. Q.J.R. Meteorol. Soc., 81: 639-640. https://doi.org/10.1002/qj.49708135027
  (e) Rain screening: The satellite-retrieved sea surface wind product used in this study is a blended product that incorporates longer-wavelength microwave radiometer data with better cloud-penetration capability, thereby mitigating the impact of rainfall on wind retrievals.
  Comment 4:
  
  Please clarify the SVD/variational notation: define Λ_p explicitly, correct the SVD equation symbols, and detail the truncation criterion for K (energy, cross-validated error, or fixed)?
  Response:
  
  (a) Λ_p has been corrected to Λ_K, where K is the truncation order. The definition of Λ_K has been added in the manuscript: "Λ_K is the diagonal matrix consisting of the first K largest singular values of the ensemble perturbation matrix A."
  (b) Selection of truncation order K:
  
  K must be less than the rank of A (i.e., K < r = rank(A)) and cannot exceed the ensemble size M (since only the first r singular values are non-zero, and M limits the maximum effective dimension of ensemble perturbations). Additionally, truncation should retain the "dominant variance" of the ensemble perturbations, typically requiring that the sum of squares of the first K singular values accounts for ≥95% of the total variance from all non-zero singular values. In this study, with an ensemble size of 30, the truncation order is fixed at K = 27.
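The 95% variance criterion mentioned above can be sketched as follows. This is an illustration only: the function name `truncation_order` and its interface are hypothetical, and it implements just the cumulative-variance rule, not the additional constraints (K < rank(A), K not exceeding the ensemble size) or the fixed choice K = 27 used in the study.

```python
def truncation_order(singular_values, energy=0.95):
    """Smallest K whose leading squared singular values reach `energy` of the total."""
    # Squared non-zero singular values, largest first.
    s2 = sorted((s * s for s in singular_values if s > 0.0), reverse=True)
    total = sum(s2)
    running = 0.0
    for k, value in enumerate(s2, start=1):
        running += value
        if running >= energy * total:
            return k
    return len(s2)  # reached only for an empty spectrum (returns 0)
```

A flat spectrum needs almost every mode to reach 95%, whereas a steep one needs only a few, which is why the criterion adapts to the ensemble rather than fixing K in advance.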
  Comment 5:
  
  I strongly recommend to expand your introduction and cite below paper.
  
  Assimilation of sentinel‐based leaf area index for modeling surface‐ground water interactions in irrigation districts
  Response:
  
  The suggested reference has been added at the end of the first paragraph in the Introduction:
  
  "Beyond meteorological forecasting, data assimilation has also been effectively applied in hydrological and environmental modeling to integrate multi-source observations, such as combining satellite-derived vegetation indices with in-situ measurements to improve the analysis of land surface and subsurface processes (e.g., Zafarmomen et al., 2024)."
  Comment 6:
  
  Abstract: Consider explicitly stating that the main novelty is the computational optimization and parallelization that makes SVD-3DEnVar suitable for operational use, plus a first real-data test with satellite-derived sea surface winds.
  Response:
  
  The abstract has been revised to emphasize the novelty:
  
  "To bridge this gap towards operational readiness, this study introduces key computational optimizations: a new three-dimensional perturbation field generation scheme that supports multi-process parallelism and can directly generate any specified grid, and an efficient parallel implementation scheme tailored for the local patch assimilation in the SVD-3DEnVar scheme."
  Comment 7:
  
  Notation consistency:
  
  Ensure all symbols in the equations are defined once and consistently (e.g., Λ vs Λ_P in Equation (10); clarify if Λ_P is the same eigenvalue matrix as in Eq. (4) or modified).
  
  Make sure the dimension of vectors and matrices in Eqs. (1)–(10) is always clear (model vs observation subspaces).
  Response:
  
  The notation regarding Λ_p has been corrected as explained in the response to Comment 4. All equation symbols have been reviewed for consistency, and vector/matrix dimensions (model vs. observation subspaces) are explicitly stated in the revised manuscript.
  Comment 8:
  
  Equation (11) Gaussian localization:
  
  Check the condition in the piecewise definition:
  
  You currently use (r_h ≤ l_h and r_v ≤ l_v) vs (r_h > l_h or r_v ≥ l_v). It might be clearer and more symmetric to use strict/≤ consistently.
  
  Consider giving an example of actual physical localization scales (in km) corresponding to l_h, l_v, σ_h, σ_v.
  Response:
  
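  Equation (11) in the manuscript has been revised to:
  
  $$
  w(\sigma_h, \sigma_v) =
  \begin{cases}
  \exp\left( -\dfrac{r_h^2}{\sigma_h^2} - \dfrac{r_v^2}{\sigma_v^2} \right), & r_h \le l_h \ \text{and}\ r_v \le l_v \\[4pt]
  0, & r_h > l_h \ \text{or}\ r_v > l_v
  \end{cases}
  $$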
  In this study, l_h is set to 10 grid points (approximately 90 km), and l_v equals the number of model layers. Both σ_h and σ_v are set to 3 grid points, corresponding to about 27 km and 1.5 km, respectively. These parameters are detailed in Section 4.1. Note that grid counts are used as units for convenience in implementation. These are preliminary settings; future work will refine them and develop more effective optimization methods for data assimilation.
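The revised Equation (11) translates directly into a short function. This is a sketch for illustration; the function name `localization_weight` and the argument order are assumptions, and all distances are taken in grid-point units as in the response.

```python
import math

def localization_weight(r_h, r_v, sigma_h, sigma_v, l_h, l_v):
    """Gaussian localization weight of revised Eq. (11).

    r_h, r_v         : horizontal / vertical distance from the patch centre
    sigma_h, sigma_v : horizontal / vertical decay scales
    l_h, l_v         : horizontal / vertical cut-off radii of the local patch
    """
    if r_h > l_h or r_v > l_v:
        return 0.0  # observation lies outside the local patch
    return math.exp(-(r_h / sigma_h) ** 2 - (r_v / sigma_v) ** 2)
```

With the settings quoted above (σ_h = 3 and l_h = 10 grid points), an observation 3 grid points away at the same level receives weight exp(−1) ≈ 0.37, and one 11 grid points away receives zero.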
  
  Citation: https://doi.org/10.5194/egusphere-2025-4632-AC2
RC2:
'Comment on egusphere-2025-4632', Anonymous Referee #2, 28 Dec 2025

= General Comments =
This manuscript presents a technical advancement in the SVD-3DEnVar data assimilation scheme, focusing primarily on improving computational efficiency through a newly proposed three-dimensional ensemble perturbation generation method and a parallelization strategy. The authors evaluate this optimized framework using the TRAMS 3.0 model in both OSSE and real-data experiments.
The practical contribution of this study is evident, particularly the reported reduction in wall-clock time (from approximately 1,700 minutes to less than 15 minutes), which is highly relevant for operational numerical weather prediction. However, the manuscript requires substantial improvement in its scientific presentation. Several critical issues were identified regarding the mathematical rigor of the methodology, the theoretical consistency of the OSSE results, and the clarity of the experimental design. Specifically, the explanation of the load-balancing mechanism within the parallel strategy is vague, and multiple inconsistencies between the text and figures undermine the credibility of the findings.
Therefore, I recommend Major Revisions before this manuscript can be considered for publication.
= Specific Comments =
1. Clarification on Parallelization and Load Balancing (Line 255)

The authors state: "The approach adopted in this study allows for synchronous invocation of all processes... effectively avoiding the problem of idle waiting... thus significantly improving resource utilization."
This explanation is scientifically insufficient. "Synchronous invocation" implies starting tasks simultaneously, but it does not inherently solve the computational load balancing problem. In a local patch-based domain decomposition, observations are rarely uniformly distributed. Consequently, using a static domain decomposition inevitably results in some processes handling significantly more observations than others. How does "synchronous invocation" prevent processes with fewer observations from finishing early and idling?

The authors seem to imply that a task redistribution mechanism is in place, but it is not clearly described. A detailed explanation of how the workload is balanced across processes is required (e.g., are grid points dynamically redistributed based on observation density or computational cost?).
2. Theoretical Validity of OSSE Results (Figure 7)

In Figure 7, the analysis increment for the u-wind component (7b) appears almost identical to the "true error" (7a) in both spatial pattern and magnitude. According to data assimilation theory, the analysis increment is the product of the Kalman gain and the innovation (observation minus background). Unless the observation error (R) was set to zero or the background error covariance (B) was inflated to an unrealistic magnitude, the analysis increment should not match the true error so perfectly. This result raises serious questions about the experimental setup or the plotting (e.g., potential confusion between innovation and increment). The authors must verify this result and provide a physical or theoretical justification.

3. Ambiguity in Observation Processing (Line 108)

The manuscript mentions: "valid data are selected according to program input requirements after quality control."

This description is too vague for a research article. How exactly are "valid data" selected? Does this process involve spatial thinning, super-obbing, or specific domain checks? A concrete description of the selection criteria is necessary to ensure reproducibility.
4. Mathematical Rigor and Definitions (Section 3.1)

The mathematical description of the SVD-3DEnVar scheme lacks precision:
- Line 135: The matrix A is rectangular; therefore, it has singular values, not "eigenvalues."
- Undefined Symbols: In Eq. (4) (A=BΛV^T), the term V^T is not defined. Similarly, in Eq. (6) (x=bα), the variable α appears without a proper definition. The authors should explicitly define these variables to aid reader understanding.
5. Overgeneralization of Results (Figure 8)

Based on a single OSSE case shown in Figure 8, the authors claim that "the SVD-3DEnVar scheme can significantly improve typhoon track and intensity forecasts." This statement is too strong for a single idealized experiment. It would be more appropriate to state that the scheme demonstrates potential for improvement in this specific case.
6. Interpretation of “Operational Feasibility”

The manuscript frequently emphasizes the operational applicability of the proposed scheme based on reduced wall-clock time. However, operational feasibility depends not only on speed but also on resource availability and system stability. Given that some results rely on a large number of computing nodes (up to 300 nodes), it would be beneficial for the authors to discuss whether similar performance gains can be achieved under more constrained computational resources, which is a common reality for many operational centers.
= Technical Corrections and Visualization Issues =
1. Figure 4: The red boxes representing parallel domains are difficult to distinguish because the model grid points are also represented by lines. I suggest representing the model grid points as dots and using lines only for the red boxes to improve visual clarity.
2. Figure 10 Issues:
- Visibility: The black line indicating the cross-section in Fig. 10a is obscured by the wind vectors. Please use a contrasting color (e.g., gray or magenta) or increase the line thickness.
- Physical Interpretation: The cross-section passes through the typhoon center. It is physically puzzling why the u-wind analysis increments (Fig. 10b) appear overwhelmingly positive across the center.
- Mismatches: The caption for Fig. 10d states it shows v-wind, but the unit label in the image is π (dimensionless pressure). Additionally, the caption describes panels up to (j), but the layout and labels need to be checked for consistency.
3. Section Titles: The titles for Section 4.2 and Section 4.3 are identical ("Results of OSSE Experiment"). Section 4.3 discusses real-data assimilation and should be titled accordingly (e.g., "Results of Real-Data Assimilation Experiment").
4. Figure 11 Caption: The caption is too brief. It needs to be expanded to clearly describe what panels (a) and (b) represent (e.g., "Comparison of typhoon tracks (a) and maximum wind speed (b) between CTL and DA experiments...").
5. Typos and Grammar:
- Line 16: "This paper constructed..." → "constructs" (or presents/proposes).
- Line 83: "microphysical schem" → "scheme".
- Line 261: "Figure 5a showed..." → "shows".
- Line 286: "The results shows that..." → "The results show that...".
- Line 445: "assimilate atmospheric variable" → "atmospheric variables".

Citation: https://doi.org/10.5194/egusphere-2025-4632-RC2
- AC3:
  'Reply on RC2', Kun Liu, 01 Jan 2026
  Response to Reviewer 3
  
  We sincerely thank the reviewer for the thorough evaluation and insightful comments, which have helped us improve the manuscript. Below, we address each point raised.
  
  Comment 1: Clarification on Parallelization and Load Balancing (Line 255)
  The authors state: "The approach adopted in this study allows for synchronous invocation of all processes... effectively avoiding the problem of idle waiting... thus significantly improving resource utilization."
  This explanation is scientifically insufficient. "Synchronous invocation" implies starting tasks simultaneously, but it does not inherently solve the computational load balancing problem. In a local patch-based domain decomposition, observations are rarely uniformly distributed. Consequently, using a static domain decomposition inevitably results in some processes handling significantly more observations than others. How does "synchronous invocation" prevent processes with fewer observations from finishing early and idling?
  The authors seem to imply that a task redistribution mechanism is in place, but it is not clearly described. A detailed explanation of how the workload is balanced across processes is required (e.g., are grid points dynamically redistributed based on observation density or computational cost?).
  Response:
  Thank you for highlighting this crucial point. Our parallel strategy involves a two-step process to achieve load balancing.
  Initial Parallel Screening: The model grid is statically partitioned. Within each partition, all processes work in parallel to screen each grid point. The criterion for marking a point as "to be assimilated" is whether the number of valid observations within its local patch meets or exceeds the preset truncation order (K).
  Dynamic Task Redistribution: After this global parallel screening, all marked grid points are collected. These points (now a scattered set, not a regular grid) are then evenly redistributed among all available processes. This ensures that each process receives a nearly equal number of assimilation tasks, effectively balancing the computational load.
  
  Through the above two steps, the assimilation computational tasks of SVD-3DEnVar can be essentially evenly distributed across various nodes.
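
  The two steps above can be sketched in serial form as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy grid, and the use of `np.array_split` in place of an actual inter-process exchange are all assumptions made for the example.

```python
import numpy as np

def screen_grid_points(obs_counts, k):
    """Step 1: flat indices of grid points whose local patch holds >= k valid observations."""
    return np.flatnonzero(np.asarray(obs_counts).ravel() >= k)

def redistribute(marked, n_procs):
    """Step 2: split the marked points into n_procs chunks whose sizes differ by at most one."""
    return np.array_split(marked, n_procs)

# Hypothetical 6 x 8 grid with observations concentrated in one corner,
# so a static partition alone would leave most processes idle.
rng = np.random.default_rng(0)
obs_counts = np.zeros((6, 8), dtype=int)
obs_counts[:3, :4] = rng.integers(5, 50, size=(3, 4))  # every count here is >= 5

K = 5  # truncation order (assumed value)
marked = screen_grid_points(obs_counts, K)
chunks = redistribute(marked, n_procs=4)
sizes = [len(c) for c in chunks]
print(sizes)
```

  In a real parallel run the chunks would be sent to the individual processes; the sketch only shows that the marked points, not the original rectangular partitions, are the unit of work being balanced.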
  
  We have added this detailed explanation to Section 3.3 of the manuscript.
  
  Comment 2: Theoretical Validity of OSSE Results (Figure 7)
  In Figure 7, the analysis increment for the u-wind component (7b) appears almost identical to the "true error" (7a) in both spatial pattern and magnitude. According to data assimilation theory, the analysis increment is the product of the Kalman gain and the innovation (observation minus background). Unless the observation error (R) was set to zero or the background error covariance (B) was inflated to an unrealistic magnitude, the analysis increment should not match the true error so perfectly. This result raises serious questions about the experimental setup or the plotting (e.g., potential confusion between innovation and increment). The authors must verify this result and provide a physical or theoretical justification.
  Response:
  We appreciate the reviewer's careful scrutiny. The close match is expected and validates the experimental design and algorithm behavior under the specific OSSE conditions:
  Dense, Perfect Observations: In the OSSE, direct observations of the u-wind component are provided at every model grid point within layers 8-32. This creates an exceptionally dense and spatially complete observing network for this variable.
  Methodology: For computational convenience, in the implementation described in this paper, we did not directly minimize the objective function (10) through iterative calculations. Instead, following the approach of Qiu and Chou (2005), we employed the method of least squares to solve for α:
  Δy = sum_{r=1}^K α_r b_r^d (11)
  By solving the algebraic system (11), the coefficients are obtained, and the required analysis increment is computed using Equation (7). In this SVD-3DEnVar implementation, the coefficients are determined solely from observational information. When sufficient observations are available (i.e., the number of observations equals or exceeds the truncation order K), the system is well-posed and a stable minimum-norm solution can be obtained, because the solution is constrained to the low-dimensional subspace spanned by the singular vectors. This eliminates the dependence on background error statistics required in traditional methods and simplifies the assimilation process.
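
  As a hedged illustration of the least-squares step in Equation (11), the sketch below solves for α with NumPy and maps the result back to state space. The sizes, the random stand-in for the retained singular vectors `b`, and the selection-type observation operator `H` are assumptions for the example and are not taken from the manuscript.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, K = 40, 60, 5  # observations, state size, truncation order (hypothetical)

# Stand-in for the K retained singular vectors in state space (n x K, orthonormal columns).
b = np.linalg.qr(rng.standard_normal((n, K)))[0]

# Hypothetical linear observation operator that picks m entries of the state vector.
H = np.eye(n)[rng.choice(n, size=m, replace=False)]
b_d = H @ b  # singular vectors mapped to observation space (m x K)

# Noise-free observation increment that lies exactly in the retained subspace.
alpha_true = rng.standard_normal(K)
delta_y = b_d @ alpha_true

# Least-squares solution of delta_y = sum_r alpha_r * b_d[:, r]  (Eq. 11).
alpha, *_ = np.linalg.lstsq(b_d, delta_y, rcond=None)

# Analysis increment in state space, built from the same coefficients.
increment = b @ alpha
```

  Because m exceeds K here, the system is overdetermined and the coefficients are recovered without any background error covariance entering the calculation.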
  
  On the other hand, since the SVD method retains only the first K singular vectors to fit the observation increments, it effectively filters out small-scale observational increment information (corresponding to singular vectors with relatively small singular values). Therefore, even when random noise is added to the observations in the OSSE experiment, it does not significantly impact the assimilation results. This is one of the differences between the SVD method and traditional assimilation methods, namely its lower sensitivity to random observational errors.
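
  The filtering argument can be illustrated numerically under simplifying assumptions: if the K observation-space vectors are orthonormal and the observation noise is uncorrelated with equal variance, the least-squares fit retains on average only a fraction K/m of the noise energy. The sizes and the random basis below are hypothetical, and the sketch is not a check of the manuscript's experiments.

```python
import numpy as np

rng = np.random.default_rng(2)
m, K, sigma = 400, 10, 1.0  # observations, truncation order, noise std (assumed)

# Orthonormal stand-in for the retained vectors in observation space (m x K).
b_d = np.linalg.qr(rng.standard_normal((m, K)))[0]

# 2000 independent realizations of pure observation noise, one per column.
noise = sigma * rng.standard_normal((m, 2000))

# With orthonormal columns the least-squares coefficients are b_d^T @ noise.
alpha = b_d.T @ noise
fitted = b_d @ alpha  # part of the noise that reaches the fitted increment

ratio = (fitted ** 2).sum() / (noise ** 2).sum()
print(ratio, K / m)
```

  The printed ratio should be close to K/m = 0.025, meaning most of the random noise falls outside the retained subspace in this idealized setting.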
  
  Reference: Qiu, C. and Chou, J.: Four-dimensional data assimilation method based on SVD: Theoretical aspect, Theor. Appl. Climatol., 83, 51–57, https://doi.org/10.1007/s00704-005-0162-z, 2005.
  
  Comment 3: Ambiguity in Observation Processing (Line 108)
  The manuscript mentions: "valid data are selected according to program input requirements after quality control."
  This description is too vague for a research article. How exactly are "valid data" selected? Does this process involve spatial thinning, super-obbing, or specific domain checks? A concrete description of the selection criteria is necessary to ensure reproducibility.
  Response:
  We agree that the description was too brief. For the specific experiments presented in this paper:
  OSSE Experiment: Simulated u-wind observations were used directly at their native model grid points without additional thinning or super-obbing.
  Real-Data Experiment (Sea Surface Winds): The multi-source satellite wind product (described in Section 2) underwent its quality control and bias correction during the retrieval process. For ingestion into our SVD-3DEnVar system, we applied a simple background check: observations were rejected if the absolute difference between the observation and the background (mapped to observation space) exceeded a threshold (set to 20 m/s for wind speed). No additional spatial thinning or super-obbing was applied for these preliminary tests.
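
  The background check described above amounts to a single threshold comparison. A minimal sketch follows; the function name and sample values are hypothetical, and the background is assumed to be already interpolated to the observation locations.

```python
import numpy as np

def background_check(obs, background_at_obs, threshold=20.0):
    """Return a mask that is True where |obs - background| does not exceed the threshold (m/s)."""
    obs = np.asarray(obs, dtype=float)
    background_at_obs = np.asarray(background_at_obs, dtype=float)
    return np.abs(obs - background_at_obs) <= threshold

# Hypothetical wind speeds (m/s): observation and background mapped to observation space.
obs = np.array([12.0, 45.0, 8.5, 30.0])
bkg = np.array([10.0, 20.0, 9.0, 10.0])

keep = background_check(obs, bkg)
accepted = obs[keep]
print(keep)
```

  Only the second observation, which departs from the background by 25 m/s, is rejected; a departure of exactly 20 m/s is kept because the criterion rejects values that exceed the threshold.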
  
  The primary focus of this study was computational optimization. Future work incorporating higher-resolution data (e.g., radar) will require and implement more sophisticated preprocessing (e.g., thinning, super-obbing). We will clarify this in Section 4.1.
  
  Comment 4: Mathematical Rigor and Definitions (Section 3.1)
  
  The mathematical description of the SVD-3DEnVar scheme lacks precision:
  - Line 135: The matrix A is rectangular; therefore, it has singular values, not "eigenvalues."
  - Undefined Symbols: In Eq. (4) (A=BΛV^T), the term V^T is not defined. Similarly, in Eq. (6) (x=bα), the variable α appears without a proper definition. The authors should explicitly define these variables to aid reader understanding.
  Response:
  Thank you for catching these issues. We have corrected them in the manuscript.
  "Eigenvalues" has been replaced with "singular values" throughout.
  In the text following Equation (4), we have added: "...where Λ is a diagonal matrix of singular values..., and B and V are orthogonal matrices containing the left and right singular vectors of A, respectively."
  
  Following Equation (6), we have added: "...where α = (α₁, α₂, ..., α_K)^T is the vector of coefficients to be determined by minimizing the cost function."
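
  To make the distinction between singular values and eigenvalues concrete, the sketch below decomposes a rectangular matrix as in Eq. (4) and checks that the squared singular values of A equal the eigenvalues of the square matrix AᵀA. The matrix sizes and random entries are assumptions for illustration; with `full_matrices=False`, NumPy returns a `B` with orthonormal columns rather than a square orthogonal matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N, K = 50, 8, 3  # state size, ensemble size, truncation order (hypothetical)

A = rng.standard_normal((n, N))  # rectangular stand-in for the perturbation matrix

# Thin SVD: A = B @ diag(s) @ Vt, with B (n x N), s (N,), Vt (N x N).
B, s, Vt = np.linalg.svd(A, full_matrices=False)
recon = B @ np.diag(s) @ Vt

# Eigenvalues exist only for the square matrix A^T A; sort them in descending order.
eig = np.linalg.eigvalsh(A.T @ A)[::-1]

# Retain the first K left singular vectors as the basis b used in x = b @ alpha.
b = B[:, :K]
```

  The rectangular A itself has no eigenvalues, which is why "singular values" is the correct term for the diagonal of Λ.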
  
  Comment 5: Overgeneralization of Results (Figure 8)
  Based on a single OSSE case shown in Figure 8, the authors claim that "the SVD-3DEnVar scheme can significantly improve typhoon track and intensity forecasts." This statement is too strong for a single idealized experiment. It would be more appropriate to state that the scheme demonstrates potential for improvement in this specific case.
  Response:
  We agree with the reviewer. The statement has been toned down as suggested. The relevant sentence in Section 4.2 now reads:
  
  "The DA experiment effectively reduces the forecast bias in both the typhoon track and intensity for this specific case. These results indicate that after assimilating the wind field, the SVD-3DEnVar scheme demonstrates the potential to improve typhoon track and intensity forecasts in this idealized framework."
  
  Comment 6: Interpretation of “Operational Feasibility”
  The manuscript frequently emphasizes the operational applicability of the proposed scheme based on reduced wall-clock time. However, operational feasibility depends not only on speed but also on resource availability and system stability. Given that some results rely on a large number of computing nodes (up to 300 nodes), it would be beneficial for the authors to discuss whether similar performance gains can be achieved under more constrained computational resources, which is a common reality for many operational centers.
  Response:
  This is a valuable point. Our scaling test up to 300 nodes was to explore the saturation point of parallelism. As seen in Fig. 5a, the performance gain (reduction in wall-clock time) slows significantly beyond ~150 nodes for our test configuration. More importantly, substantial gains are achieved with far fewer resources.
  Using approximately 30 nodes (with 64 cores each), the assimilation time is reduced to about 1 hour, which is already a dramatic improvement from the original serial execution.
  The current experiments assimilate very dense data (effectively all grid points in the OSSE). In an operational setting, realistic observations would be much sparser after standard quality control and thinning procedures. This would further reduce the computational cost, potentially allowing runtime targets (e.g., under 30 minutes) to be met with fewer than 60 nodes.
  
  Since operational ensemble prediction systems already require tens to hundreds of nodes to run the ensemble forecasts, dedicating a comparable subset for an efficient assimilation step is feasible and represents a significant step toward operational readiness. We will add a brief discussion on this in Section 5 (Conclusion).
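
  The timings quoted in this discussion can be turned into rough speedup figures. The sketch below assumes that the 1,700-minute serial run from the abstract, the roughly 1-hour run on 30 nodes, and the 15-minute upper bound on 150 nodes all refer to the same test configuration; since the last two are approximate, the results are indicative only.

```python
serial_min = 1700.0  # serial runtime in minutes, from the abstract

# Node count -> approximate wall-clock minutes ("about 1 hour", "less than 15 minutes").
timings = {30: 60.0, 150: 15.0}

# Speedup relative to the serial run.
speedup = {nodes: serial_min / t for nodes, t in timings.items()}

# Relative efficiency of scaling from 30 to 150 nodes:
# time reduction factor divided by the increase in node count.
relative_efficiency = (timings[30] / timings[150]) / (150 / 30)

for nodes, s in speedup.items():
    print(f"{nodes} nodes: speedup ~ {s:.1f}")
print(f"30 -> 150 nodes relative efficiency ~ {relative_efficiency:.2f}")
```

  Because 15 minutes is an upper bound, the speedup at 150 nodes and the relative efficiency computed here are lower bounds under the stated assumption.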
  
  Comment 7: Technical Corrections and Visualization Issues
  Figure 4: The red boxes representing parallel domains are difficult to distinguish because the model grid points are also represented by lines. I suggest representing the model grid points as dots and using lines only for the red boxes to improve visual clarity.
  
  Response:
  
  Figure 4: We have changed the representation of the model horizontal grid points from a grid of lines to black crosses, which significantly improves the distinction from the red partition boxes.
  
  Figure 10 Issues:
  
  - Visibility: The black line indicating the cross-section in Fig. 10a is obscured by the wind vectors. Please use a contrasting color (e.g., gray or magenta) or increase the line thickness.
  Response:
  
  We have changed the wind vector arrows from black to a high-contrast purple to make the overlaid black cross-section line more visible without altering the line itself, maintaining consistency across subplots.
  
  - Physical Interpretation: The cross-section passes through the typhoon center. It is physically puzzling why the u-wind analysis increments (Fig. 10b) appear overwhelmingly positive across the center.
  Response:
  
  We apologize for the confusion. Figure 10b plots the increment in total wind speed, not the u-component. The background field underestimated the wind speed in the typhoon core (Fig. 9c), so a positive increment across the center is physically consistent for wind speed magnitude.
  
  - Mismatches: The caption for Fig. 10d states it shows v-wind, but the unit label in the image is π (dimensionless pressure). Additionally, the caption describes panels up to (j), but the layout and labels need to be checked for consistency.
  Response:
  
  We have corrected the caption and verified all labels. Panel (d) correctly shows the π (Exner pressure) increment. The reference to the nonexistent panel (j) has been removed.
  
  Section Titles: The titles for Section 4.2 and Section 4.3 are identical ("Results of OSSE Experiment"). Section 4.3 discusses real-data assimilation and should be titled accordingly (e.g., "Results of Real-Data Assimilation Experiment").
  
  Response:
  
  Section Titles: The title of Section 4.3 has been changed to "4.3 Assimilation of Sea Surface Wind Observations".
  
  Figure 11 Caption: The caption is too brief. It needs to be expanded to clearly describe what panels (a) and (b) represent (e.g., "Comparison of typhoon tracks (a) and maximum wind speed (b) between CTL and DA experiments...").
  
  Response:
  
  Figure 11 Caption: The caption has been expanded as suggested: "Figure 11: Comparison of (a) typhoon tracks and (b) 10‑m maximum wind speed (units: m s⁻¹) among the control experiment (CTL), the data assimilation experiments (DA1, DA2, DA3), and observations (OBS) for Typhoon Yagi (2024). Forecasts start at 00:00 UTC on 6 September 2024."
  
  Typos and Grammar:
  
  - Line 16: "This paper constructed..." → "constructs" (or presents/proposes).
  - Line 83: "microphysical schem" → "scheme".
  - Line 261: "Figure 5a showed..." → "shows".
  - Line 286: "The results shows that..." → "The results show that...".
  - Line 445: "assimilate atmospheric variable" → "atmospheric variables".
  Response:
  All noted corrections have been made in the manuscript ("constructs", "scheme", "shows", "show", "variables"). We have also performed a thorough check for tense and grammar throughout the paper.
  
  Citation: https://doi.org/10.5194/egusphere-2025-4632-AC3

Viewed

Total article views: 548 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
390	120	38	548	22	24

Views and downloads (calculated since 14 Nov 2025)

Month	HTML	PDF	XML	Total
Nov 2025	123	16	13	152
Dec 2025	81	46	18	145
Jan 2026	140	28	7	175
Feb 2026	41	27	0	68
Mar 2026	5	3	0	8

Viewed (geographical distribution)

Total article views: 534 (including HTML, PDF, and XML) Thereof 534 with geography defined and 0 with unknown origin.

Latest update: 02 Mar 2026

Short summary

The Singular Value Decomposition-three Dimensional Ensemble Variational data assimilation scheme is applied for the first time in the Tropical Regional Atmospheric Model System. With optimized three-dimensional perturbation generation and parallel strategies, computational costs were greatly reduced. Results indicate that the optimized scheme maintains reasonable accuracy while achieving much higher efficiency, suggesting good potential for practical forecasting use.

