the Creative Commons Attribution 4.0 License.
Long-window hybrid variational data assimilation methods for chaotic climate models tested with the Lorenz 63 system
Abstract. A hybrid 4D-variational data assimilation method for chaotic climate models is introduced using the Lorenz '63 model. This approach aims to optimise an Earth system model (ESM), for which no adjoint exists, by utilising an adjoint model of a different, potentially simpler ESM. The technique relies on synchronisation of the model to observed time series data employing the dynamical state and parameter estimation (DSPE) method to stabilise the tangent linear system by reducing all positive Lyapunov exponents to negative values. Therefore, long windows can be used to improve parameter estimation. In this new extension a second layer of synchronisation is added between the two models, with and without an adjoint, to facilitate linearisation around the trajectory of the model without an adjoint. The method is conceptually demonstrated by synchronising two Lorenz '63 systems, representing two ESMs, one with and the other without an adjoint model. Results are presented for an idealised case of identical, perfect models and for a more realistic case in which they differ from one another. If employed with a coarser ESM with an adjoint, the method will save computational power as only one forward run with the full ESM per iteration needs to be carried out. It is demonstrated that there is negligible error and uncertainty change compared to the 'traditional' optimisation of a full ESM with an adjoint. In a variation of the method outlined, synchronisation between two identical models can be used to filter noisy data. This reduces optimised parametric model uncertainty by approximately one third. Such a precision gain could prove valuable for seasonal, annual, and decadal predictions.
Status: open (until 03 Feb 2025)
RC1: 'Comment on egusphere-2024-3613', Anonymous Referee #1, 07 Jan 2025
The authors present a new synchronisation strategy that can be used to perform parameter estimation with a variational data assimilation method. The strategy forms a new coupled system with two components. Component 1 has a nudging term between the observations and its own model forecast, and component 2 is nudged towards the model forecast of component 1. As motivation, the authors argue that because the adjoint of a high-resolution model (here, component 1) can be difficult to obtain or expensive to run, one can perform parameter estimation using only the adjoint of a low-resolution model (component 2). If I understand correctly, the SFDA experiment is to some extent similar to weakly coupled data assimilation, where the model forecast is performed by a coupled system (e.g., atmosphere-ocean) and the data assimilation is performed for only one component of the system (e.g., the atmosphere). Here, the authors perform experiments on a Lorenz 63 model.
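The two-layer coupling summarised above can be sketched for two identical Lorenz 63 models as follows. This is a minimal illustration under stated assumptions, not the manuscript's implementation: the observations are taken to be a noise-free "truth" run, all three variables are nudged, and the names `alpha_obs`, `alpha_sync`, `rhs`, `rk4_step` and `run`, as well as the nudging strength of 10, are hypothetical choices made only for this example.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic Lorenz 63 parameters


def lorenz63(s):
    """Right-hand side of the Lorenz 63 equations for a state s = (x, y, z)."""
    x, y, z = s
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])


def rhs(u, alpha_obs, alpha_sync):
    """Nine-variable system: free-running truth, component 1, component 2.

    Component 1 is nudged towards the (assumed noise-free) observations,
    component 2 is nudged towards component 1.
    """
    truth, c1, c2 = u[0:3], u[3:6], u[6:9]
    d_truth = lorenz63(truth)
    d_c1 = lorenz63(c1) + alpha_obs * (truth - c1)
    d_c2 = lorenz63(c2) + alpha_sync * (c1 - c2)
    return np.concatenate([d_truth, d_c1, d_c2])


def rk4_step(u, dt, alpha_obs, alpha_sync):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(u, alpha_obs, alpha_sync)
    k2 = rhs(u + 0.5 * dt * k1, alpha_obs, alpha_sync)
    k3 = rhs(u + 0.5 * dt * k2, alpha_obs, alpha_sync)
    k4 = rhs(u + dt * k3, alpha_obs, alpha_sync)
    return u + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def run(alpha_obs, alpha_sync, t_end=20.0, dt=0.01):
    """Integrate from three different initial states; return the final states."""
    u = np.array([1.0, 1.0, 20.0, -5.0, 5.0, 25.0, 8.0, -3.0, 30.0])
    for _ in range(int(round(t_end / dt))):
        u = rk4_step(u, dt, alpha_obs, alpha_sync)
    return u[0:3], u[3:6], u[6:9]


if __name__ == "__main__":
    truth, c1, c2 = run(alpha_obs=10.0, alpha_sync=10.0)
    print("component 1 vs truth:", np.linalg.norm(c1 - truth))
    print("component 2 vs component 1:", np.linalg.norm(c2 - c1))
```

With full-state nudging, the error growth rate is the leading Lyapunov exponent of the uncoupled model minus the nudging strength, so a strength well above roughly 0.9 should pull both components onto the truth trajectory. Nudging only a subset of variables, as the manuscript may do, changes this threshold.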
The problem itself is interesting and worth investigating. However, the manuscript raises concerns about both the quality of the scientific discussion and the presentation of the study. I encourage the authors to revise the manuscript carefully to improve its readability and the scientific discussion.
Major comments:
1. In general, I feel the authors do not clearly state the status of the field of parameter estimation in climate models or the motivation of the study. For example, the authors should clearly state the rationale behind the proposed framework and the problems being solved. Why do you need a long DA window for parameter estimation? Why do you think the new framework has any benefits? In the current formulation, the parameter estimation also requires a sensitivity matrix with respect to the parameters. Would this be difficult to obtain in complex climate models?
2. The methodology section needs significant restructuring. The authors should first briefly introduce the synchronisation method as a generic method instead of its L63 formulation. Then, the cost function of the variational method should be introduced along with its gradient. Finally, the authors should introduce the Lorenz 63 model, the exact formulation of the synchronisation model with the Lorenz 63 model, and the terms in the cost function and its gradient for the Lorenz 63 model. The paper might also benefit from an experiment setup (sub)section, which provides details of the choice of nudging strategy, the chosen observation noise values and twin experiment setup, the metrics used, etc. One of the stated benefits of the synchronisation approach is the possibility of using a long DA window. However, the DA window of 100 TUs is given only at the very end of Sect. 3.2.
3. The authors need to check Eqs. (7)-(15). The adjoint equations presented in this study are normally obtained when the cost function is a temporal integral, because the derivation involves integration by parts. Yet the cost functions are given as discrete-time summations. In the SFDA section, the cost function is given as the misfit between the observations and x_a, and presumably x_a has 3 elements. However, given Eq. (8), the gradient of the cost function in Eq. (12) does not hold, because Eq. (12a) is a difference between a 3-element vector and a 6-element vector, M*_{SFDA} being a 6 x 6 matrix. Do the authors define x_a = (x_f x_a), or is M*_{SFDA} given incorrectly? Moreover, it is unclear to me why the authors give the gradient of the cost function with respect to the parameter theta (Eq. 13, Eq. 15) as an element-wise multiplication when it should be a matrix-vector product of dM/dtheta and lambda(t).
4. More interpretation is needed in the results section. In Figures 4 and 7, with small alpha, the solid line does not look like the median of the ensemble, and I can only guess that all results lead to increased errors. Is this the case? Also, why does the parameter estimation perform better with increased alpha? The authors also need to provide a better comparison and interpretation of the differences between the single, SFDA and HDA approaches. What causes the need for different alpha? The lack of interpretation makes Sect. 3.1 and 3.2 read very similarly.
5. It would be good to look at the impact of different DA window lengths on the parameter estimation, or on the Lyapunov exponents of the system. It may also be useful to check the performance when the components of the synchronised system have different parameter values.
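To make the dimensional argument in major comment 3 concrete, one continuous-time formulation that would be self-consistent is sketched below. The notation (state w, model F, observation operator H, adjoint variable lambda) is generic and assumed for illustration; it is not taken from the manuscript's Eqs. (7)-(15).

```latex
% Assumed generic notation: state w(t), model dw/dt = F(w, theta),
% observations y_o(t), linear observation operator H.
\begin{align}
J(\theta) &= \frac{1}{2}\int_0^T
  \bigl(\mathbf{H}\mathbf{w} - \mathbf{y}_o\bigr)^{\mathsf T}
  \bigl(\mathbf{H}\mathbf{w} - \mathbf{y}_o\bigr)\,\mathrm{d}t, \\
-\frac{\mathrm{d}\boldsymbol{\lambda}}{\mathrm{d}t} &=
  \mathbf{M}^{\mathsf T}\boldsymbol{\lambda}
  + \mathbf{H}^{\mathsf T}\bigl(\mathbf{H}\mathbf{w} - \mathbf{y}_o\bigr),
  \qquad \boldsymbol{\lambda}(T) = \mathbf{0},
  \qquad \mathbf{M} = \frac{\partial \mathbf{F}}{\partial \mathbf{w}}, \\
\nabla_{\theta} J &= \int_0^T
  \left(\frac{\partial \mathbf{F}}{\partial \theta}\right)^{\mathsf T}
  \boldsymbol{\lambda}\,\mathrm{d}t .
\end{align}
```

For a six-variable coupled state w = (x_f, x_a) in which only x_a is compared with the observations, H would be the 3 x 6 matrix [0 I], so that the forcing term H^T (H w - y_o) has six elements and matches the 6 x 6 adjoint operator. The parameter gradient is then a matrix-vector product integrated over the window rather than an element-wise product.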
Detailed comments:
1. "a sequential data assimilation scheme (Bertino et al., 2003) and the variational approach (Le Dimet and Talagrand, 1986)." -> "...sequential and variational data assimilation schemes..."
2. "defined as the quadratic misfit between the observational and model data within an assimilation time window" -> Usually, the 4DVar cost function has a background term, making it equivalent to a maximum likelihood problem in view of Bayes' theorem under a Gaussian assumption. It would be useful to distinguish the cost function used here from the more common formulation.
3. "Due to the nonlinearities within ESMs..." -> "The use of adjoint models faces several challenges. Due to..."
4. "the problem can be mitigated by synchronisation which removes the non-linear or chaotic dynamics from the adjoint model leading to a smooth cost function" -- I feel it might be better to phrase it as "...synchronisation which constructs a system with reduced sensitivity to initial conditions leading to ..."; moreover, the authors should discuss the benefits of a long DA window, especially for parameter estimation.
5. "This method allows extension of" -- "This method allows for the extension of"
6. "To mitigate both problems, we propose a novel framework where we use two climate models both coupled through synchronisation, one with a high resolution and the other with coarse resolution for which an adjoint exists. " -- Is it common for adjoints of coarse-resolution models to be available while adjoints of high-resolution models are not? The authors should provide references and a discussion of how widespread this issue is. Further, the authors should discuss how this novel framework could mitigate the problems of smoothness and dimensionality.
7. "The objective of this paper is to quantify the precision and the benefit of such a synchronised data assimilation approach." -- I believe this is what you have done instead of the objective of the study. A better objective would be to investigate the performance of the novel approach you proposed.
8. "We perform this test conceptually using a Lorenz ’63 model system." -- The test is not performed "conceptually".
9. "The advantage is that it can be used to quantitatively evaluate the parameter dependence of the system prior to application in a full model." -- This sentence needs rephrasing. I guess the authors want to say "...quantitatively evaluate data assimilation schemes..." because the parameter dependence of a system will change for different dynamical systems. In fact, parameters of Lorenz 63 are non-dimensionalised numbers of a convection system. These parameters may not appear explicitly in a full climate model.
10. "It can also be used in a wide range of other applications (Du and Shiue, 2021; Cameron and Yang, 2019; Pelino and Maimone, 2007). " -- you might want to describe examples of these applications.
11. Eq. (1) describes the classic L63 model, which I believe should have a reference to the original paper by Lorenz.
12. "Sub-section" can be just "Section"
13. "The random noise value magnitudes are bounded by a given percentage relative to the systems’ standard distribution." --- Please provide more details of your random noise choices. This is for the sake of reproducibility and credibility of the research.
14. Eq. (2) is technically not the adjoint model/TLM. The TLM is defined as d {\delta x}/ dt = M {\delta x}. Also, matrices are conventionally given as bold capital letters and the matrix transpose operator should not be italic. The vector x and the dot operator are not defined here. In fact, I doubt the necessity of this equation, as the study does not use it at all.
15. In Eq. (3), x_a, y_a, z_a are not defined. Considering that the authors discuss the nudging of the z variable, would it be good to have an alpha(z_o - z_a) term in Eq. (3) first? Moreover, this is an equation of the synchronisation strategy specifically for the Lorenz 63 model. Could the authors provide a general description of synchronisation before the case-specific description? Also, is this the single model approach mentioned in Sect. 3? If so, the authors should clearly state it.
16. Again, I doubt the necessity of having Eq. (4).
17. "...hybrid data assimilation (HDA.)..." -> "...hybrid data assimilation (HDA). ..."
18. Eq. (11): is it always a gradient with respect to x_a? Should this depend on whether SFDA or HDA is used?
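Regarding detailed comment 14, the tangent linear operator M of the Lorenz 63 model and its adjoint M^T can be written down and checked numerically. The sketch below is an illustration only: the function names `jacobian` and `finite_difference_jacobian`, the test state, and the random seed are hypothetical choices, not quantities from the manuscript.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic Lorenz 63 parameters


def lorenz63(s):
    """Right-hand side of the Lorenz 63 equations for a state s = (x, y, z)."""
    x, y, z = s
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])


def jacobian(s):
    """Tangent linear operator M = dF/dx evaluated at the state s."""
    x, y, z = s
    return np.array([
        [-SIGMA, SIGMA, 0.0],
        [RHO - z, -1.0, -x],
        [y, x, -BETA],
    ])


def finite_difference_jacobian(s, eps=1e-6):
    """Central-difference approximation of M, used only as a cross-check."""
    J = np.empty((3, 3))
    for j in range(3):
        step = np.zeros(3)
        step[j] = eps
        J[:, j] = (lorenz63(s + step) - lorenz63(s - step)) / (2.0 * eps)
    return J


if __name__ == "__main__":
    state = np.array([1.0, 2.0, 20.0])
    M = jacobian(state)
    print("max TLM error:", np.max(np.abs(M - finite_difference_jacobian(state))))

    # Dot-product test for the adjoint: <M dx, lam> should equal <dx, M^T lam>.
    rng = np.random.default_rng(0)
    dx, lam = rng.standard_normal(3), rng.standard_normal(3)
    print("adjoint test residual:", abs(np.dot(M @ dx, lam) - np.dot(dx, M.T @ lam)))
```

In this notation the TLM reads d(delta x)/dt = M delta x, and the adjoint model integrates M^T backwards in time, which is the distinction the comment draws.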
Citation: https://doi.org/10.5194/egusphere-2024-3613-RC1