the Creative Commons Attribution 4.0 License.
Multilevel Monte Carlo methods for ensemble variational data assimilation
Abstract. Ensemble variational data assimilation relies on ensembles of forecasts to estimate the background error covariance matrix B. The ensemble can be provided by an Ensemble of Data Assimilations (EDA), which runs independent perturbed data assimilation and forecast steps. The accuracy of the ensemble estimator of B is strongly limited by the small ensemble size that is needed to keep the EDA computationally affordable. We investigate here the potential of the multilevel Monte Carlo (MLMC) method, a type of multifidelity Monte Carlo method, to improve the accuracy of the standard Monte-Carlo estimator of B while keeping the computational cost of ensemble generation comparable. MLMC exploits the availability of a range of discretization grids, thus shifting part of the computational work from the original assimilation grid to coarser ones. MLMC differs from the mere averaging of statistical estimators, as it ensures that no bias from the coarse resolution grids is introduced in the estimation. The implications for ensemble variational data assimilation systems based on EDAs are discussed. Numerical experiments with a quasi-geostrophic model demonstrate the potential of the approach, as MLMC yields more accurate background error covariances and reduced analysis error. The challenges involved in cycling a multilevel variational data assimilation system are identified and discussed.
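As a rough illustration of the telescoping idea described in the abstract, the following minimal sketch builds a two-level MLMC covariance estimate from a plain Monte Carlo estimate on a cheap model plus a fine-minus-coarse correction computed from shared stochastic inputs. It is not the authors' implementation: the helper names (`sample_cov`, `mlmc_covariance`), the toy linear "models", and the ensemble sizes are all assumptions chosen for illustration, and the states are assumed to be already expressed on the finest grid.

```python
import numpy as np

def sample_cov(ensemble):
    """Unbiased sample covariance of an (n_members, n_state) ensemble."""
    return np.cov(ensemble, rowvar=False, ddof=1)

def mlmc_covariance(models, n_members, n_inputs, rng):
    """Telescoping MLMC covariance estimate (hypothetical helper).

    models    : callables ordered from coarsest to finest; each maps
                (n, n_inputs) stochastic inputs to (n, n_state) states
                already expressed on the finest grid.
    n_members : number of ensemble members per level.
    """
    # Level 0: plain Monte Carlo estimate with the cheapest model.
    inputs = rng.standard_normal((n_members[0], n_inputs))
    estimate = sample_cov(models[0](inputs))
    # Higher levels: fine-minus-coarse corrections, each computed from
    # one shared set of inputs so that the two simulations are coupled.
    for level in range(1, len(models)):
        inputs = rng.standard_normal((n_members[level], n_inputs))
        fine = sample_cov(models[level](inputs))
        coarse = sample_cov(models[level - 1](inputs))
        estimate = estimate + (fine - coarse)
    return estimate

# Toy setup (assumption): linear "models" x = A z with z ~ N(0, I),
# so the exact covariance of the finest level is A_fine @ A_fine.T.
rng = np.random.default_rng(0)
A_fine = rng.standard_normal((4, 4))
A_coarse = A_fine + 0.1 * rng.standard_normal((4, 4))

def coarse_model(z):
    return z @ A_coarse.T

def fine_model(z):
    return z @ A_fine.T

# Many cheap coarse members, few expensive fine ones.
B_mlmc = mlmc_covariance([coarse_model, fine_model], [200, 20], 4, rng)
B_exact = A_fine @ A_fine.T
print(np.linalg.norm(B_mlmc - B_exact) / np.linalg.norm(B_exact))
```

Because the coarse-model covariance appears once with a plus sign and once with a minus sign, its expectation cancels and the estimate targets the fine-model covariance. The result is a sum and difference of sample covariances, so it is symmetric but not guaranteed to be positive semidefinite.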
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2024-3628', Anonymous Referee #1, 22 Jan 2025
The authors discuss background error covariance estimation using the (weighted) multi-level Monte Carlo (wMLMC) method in variational data assimilation (DA). They address several practical considerations that arise when MLMC is used to estimate a covariance matrix: 1) the mean squared error and variance of an MLMC covariance estimator; 2) computational budget allocation; 3) localisation and positive definiteness of the estimated covariance matrix. The MLMC estimator and the performance of the corresponding 3DEnVar are investigated using a two-dimensional, two-layer quasi-geostrophic channel model after a 12-hour forecast from an initial ensemble, without data assimilation cycles. The paper is well written and worth publishing.
Major comments:
1. The Experimental setting section may benefit from a figure illustrating the model setup.
2. The current results are all built on a single dynamical snapshot of the model. Optionally, would it be possible to build a stronger case by running a long deterministic trajectory of the model and selecting a few time steps with very different dynamical features as initial conditions for generating ensembles with a 12-hour forecast, so that the computational cost does not increase drastically?
3. When the ensemble member allocation is tuned based on Eq. (14) and (16), do we expect a^(k) and b^(k), or C^(k), to change significantly because of the flow dependency of ensemble forecasting? The authors suggest changing the ensemble allocation less frequently (L573 - L575). Do we expect the estimated B matrix to be at least as accurate, in terms of MSE, as the high-resolution B matrix, i.e. the one from the MC method?
4. The results obtained with the B matrices from MLMC and MC show only limited improvement in the analysis. Could this be related to the smooth streamfunction in the QG model? Would we expect significant differences for other fields, e.g., PV?
Minor comments:
L5: "...affordable We investigate..." -> "...affordable. We investigate..."
L57: "...the Ensemble Kalman Filter(..." -> "the ensemble Kalman filters ("
L87: "the composition operator" -> "a composition operator"
L159: Perhaps a set should be represented with curly brackets?
L160: "...stochastic inputs are all independent..." -> "... number of stochastic inputs are all independent..."
L221: The authors state that "...related to small fourth-order moments of the correction terms, and so to strong correlations between stochastically-coupled simulations...". Does this mean that adjacent model levels must yield similar outcomes? How close should these levels be? Does this also justify the 0 value for level 0?
L270: "corrections term" -> "correction term"
L318: "1:3 ratio..." -> "1:3 ratio between the width and length of the domain..."
L443: "apply them to a Dirac impulse" reads as if the estimator is used to estimate the covariance of a Dirac impulse, which is not the case, I think. This also means that Fig. 4 and 5 show the covariance of one grid point with the entire domain.
L477: Here, can the authors briefly explain why the cost is proportional to the number of ensemble members instead of the grid size?
L534: "...is to remain..." -> "...remains..."
L553: What is the B norm?
L641: alpha should be bold?
Citation: https://doi.org/10.5194/egusphere-2024-3628-RC1
AC1: 'Reply on RC1', Mayeul Destouches, 10 Mar 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3628/egusphere-2024-3628-AC1-supplement.pdf
-
RC2: 'Comment on egusphere-2024-3628', Alban Farchi, 24 Jan 2025
In this manuscript, the authors propose to use multilevel Monte Carlo methods to estimate the background error covariance matrix (B) that is used in a variational data assimilation method such as 3D- and 4D-Var. The authors first describe the methodology (how to estimate B using first regular then multi-level Monte Carlo methods) in a pedagogical way. They then show numerical experiments using a low-order model, illustrating the increased accuracy of the estimated B matrix and the resulting increased accuracy of the analysis in a 3D-Var assimilation system.
The manuscript is very well written and easy to follow, even though some of the developments can be technical from time to time. I would recommend publication after minor revisions to address a couple of remarks.
General comments
- Some sub-sections, for example in section 3, are substantial and, even though I do not think that they should be shortened, perhaps they could be split into sub-sub-sections.
- Small impact in data assimilation experiments (last paragraph of section 5): I have the feeling that this discussion could be extended. Do you think that the conclusions would change in a cycled data assimilation context?
- From figure 1, some numerical artifacts are clearly visible in all grids. You mention that, for l=1 and 2, this is consistent with the fact that the bicubic interpolation scheme does not conserve spatial derivatives. Would this issue be mitigated by using higher order interpolation tools? For l=3 and 4, the numerical artifacts seem to be related to the boundary conditions at the south and north poles. Is this "worrying"?
Technical remarks and comments
- L 122 reference to Eq. (3) -> I think you mean the equation (without number) at L 120
- L 130 Please define what you mean by a "correlation matrix"
- L 150-152 "We assume that the models are ordered from the least (fidelity level l = 1) to the most accurate (fidelity level l = L), from the computationally cheapest (l = 1) to the most computationally expensive (l = L)." What about the (unlikely) case where model A is more accurate but less computationally expensive than model B?
- L 246 Is "natural estimator" the most appropriate term here? What about "standard estimator" or "canonical estimator"?
- L 248-249 "but we have preferred using the biased versions here for the sake of simplicity" what do you mean exactly by "here"? In the numerical experiments of the following sections?
- L 318-319 "with data points defined at the nodes of the grid" Does it really make a difference if the data points are at the nodes or at the centre of the grid?
- L 324 "centred on a grid point" In American English (which seems to be the convention used in the present manuscript) this should be "centered on a grid point".
- L 331-332 The code availability statement is sufficient in my opinion. You could remove these two lines.
- L 336 "Performance is measured from the analysis error with respect to a truth run." The term "analysis error" is a bit vague. I would suggest to use "analysis RMSE" (which is the metric used in Section 5.4). Also, "a truth run" could imply that for a single experiment there are multiple "truths", I would hence suggest to reformulate to "the truth".
- L 346-347 "from a Gaussian covariance model" For clarity I would suggest to give the expression for this model.
- L 347 "horizontally" and "vertically" This is a bit confusing to me. Naively, I would assume that "horizontally" means in the x-y plane and "vertically" means in the z direction. However, here we have only two model levels (i.e. only two nodes in the z direction), so it is simpler (better?) to directly give the correlation between both levels. When reading this sentence, I was therefore wondering whether "horizontally" means in the x direction and "vertically" in the y direction. To conclude, in order to avoid potential confusion, I would suggest giving the actual correlation between model levels in addition to (or perhaps even instead of) the vertical length scale.
- L 414-416 "we fine-tune the allocation by removing or adding members to ensure we stay below the target budget while having as many ensemble members as possible on each coupling group." How is this "fine-tuning" performed? Using a set of (deterministic) pre-defined rules?
- L 492-493 "Since localizing a covariance estimator makes it biased, preserving the unbiasedness of the wMLMC covariance estimator should not be our primary concern" Isn't it impossible to preserve the unbiasedness of the estimator when using localisation? One could understand from the second part of the sentence that it is possible in principle.
- L 524 "out of the critical path of a data assimilation suite" I would suggest to define what you mean here by "critical path".
- L 539 "...parametric B hybridization..." -> "parametric B, hybridization"
- L 642 "The contributions of the authors is given..." -> "The contributions of the authors are given..."
- Variances are sometimes written "\mathcal{V}" and sometimes "Var". Please be consistent.
- I strongly advise to give a number to all equations. Indeed, even if a specific equation is not very important and not mentioned in the present manuscript, readers (or authors of potential follow-up papers) may need to refer to it. Furthermore, every equation should end with a punctuation sign (",", ".", ";", etc.)
- Table 1: I would suggest to use the "booktabs" package, with no vertical separation between columns, but with top and bottom horizontal lines and with potential horizontal separators (eg here between row 1 and row 2).
- Table 2: same remarks as table 1. In addition, I would suggest to use right alignment for numbers (which makes it easier to compare rows). In the fourth column the time unit (min) could be included in the column header. For the last column, I would suggest to use the same number of digits per row (again, to make it easier to compare rows).
- Figure 1: a color bar is missing. The y labels could use the explicit value of l. The caption could mention "epsilon_1" and "epsilon_2" as the two ensemble members.
- Figure 2: "forecast cost" is called "normalized integration cost" previously, right?
- Figure 4: I would suggest to add a title to each subplot to indicate which estimate is plotted. Also, is the colormap properly centered on 0?
- Figure 5: Same remarks as for figure 4. In addition, I would suggest to reverse the colormap for the top two panels (this would be better when printing the page).
- Figure 6: The caption mentions "shaded area", but the figure uses filled areas.
- Figure 7: I would suggest to use 2 or 3 columns for the legend (to avoid hiding some part of the figure). The triple horizontal line at y=0 is a bit weird. I would also suggest to increase the size of the markers in the legend (as such, it is very difficult to distinguish between blue and purple).
- Figure 8: Whisker plots can sometimes be a bit deceiving. You could potentially use violin plots instead (although, to be fair, violin plots have their own limitations as well).
Citation: https://doi.org/10.5194/egusphere-2024-3628-RC2
AC2: 'Reply on RC2', Mayeul Destouches, 10 Mar 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3628/egusphere-2024-3628-AC2-supplement.pdf