Comparison of two Euler equation sets in a Discontinuous Galerkin solver for atmospheric modelling (BRIDGE v0.9)
Abstract. The implementation of a 'classical' Discontinuous Galerkin (DG) solver for atmospheric flows is presented that is designed for efficient use in numerical weather prediction, climate simulations, and meteorological research both on the whole sphere and for limited area modeling. To this purpose the horizontally explicit, vertically implicit (HEVI) approach is used together with implicit-explicit (IMEX)-Runge-Kutta (RK) time integration schemes and a moderate spatial approximation order (order 4 or 5). Two Euler equation sets using mass, momentum and either density weighted potential temperature θ or total energy E as prognostic variables are compared by several idealised test cases. Details of the formulation of the Euler equations in covariant form using the Ricci tensor calculus, the linearisations needed for HEVI (especially for the total energy set), boundary conditions for an IMEX-RK scheme, and filtering for numerical stabilisation are given. Furthermore, the implementation of distributed memory parallelisation, the tensor product representation for prismatic grid cells, and optimisations for the HEVI formulation, are outlined. These developments lead to the so-called BRIDGE code, which will serve as a code base for a later DG extension of the well established ICON model. From the used idealised test cases, which are standard benchmarks for dynamical core development for the atmosphere, we conclude that the equation set using total energy E has better well-balancing properties than the set using θ. This result can be confirmed by a normal mode stability analysis. However, in some tests the set using E suffers more from non-linear instabilities that can only partially solved by filtering.
General Comments:
This paper focuses on the implementation details of a standard discontinuous Galerkin (DG) formulation for the compressible Euler equations with gravity for atmospheric applications. Two equation sets are studied: one using density-weighted potential temperature (theta) as the thermodynamic variable and another using total energy (energy). A HEVI time-integration strategy is used to evolve the equations forward in time via the method of lines. Various test cases are explored, ranging from one-dimensional to two-dimensional to three-dimensional problems on the sphere. This manuscript represents an impressive amount of work, for which I commend the authors. It is therefore understandable that, given a finite amount of time, some tasks may not have been covered as thoroughly as they would have liked.
(i) The takeaway of this paper is a bit vague: they show that for the normal mode analysis the energy form yields the expected results whereas the theta form does not. This then allows them to justify the use of the STF stabilization for the theta form. However, when they run the baroclinic instability, they find that both equation sets are unstable for this test case; with filtering added, the theta form is stable for at least 40 days whereas the energy form is not. They mention nonlinear instabilities, but no further explanation is given.
(ii) Could you perform an analysis to get to the bottom of this like you did in the normal mode analysis?
(iii) I also find this baroclinic instability test (the Jablonowski-Williamson) to be non-standard for a nonhydrostatic model. The Ullrich test case is designed for nonhydrostatic deep- and shallow-planet equations (Ullrich et al QJRMS 2014). Moreover, this test case should remain stable for a well-constructed atmospheric model.
(iv) I also found the use of filters to stabilize the model to be non-standard. Why not use either hyper-diffusion (as most models do) or the same stabilization techniques used in ICON, to allow a direct comparison? The issue with filters is that if they are non-idempotent (this is not discussed in the manuscript), then comparing models that require different time-steps is problematic, because the dissipation applied is not equal (this is not the case for artificial-diffusion strategies).
(v) I also found the discussion difficult to follow: there are many details that seem less important, while references are omitted that could have helped the authors either reduce the amount of text and/or assist them with strategies for analyzing/improving their models.
(vi) It would also be helpful to the reader if they wrote early on what exactly is the novelty of the paper.
(vii) The last point is that the number of degrees of freedom is never mentioned. This is important because it helps one understand how this model compares to other models (e.g., ICON) at the same number of degrees of freedom. Of course, it is better to compare at the same error, but this requires constructing work-precision diagrams, which in turn requires (1) knowing the analytic solution, (2) running BRIDGE, and (3) comparing it to another model (e.g., ICON). From timings alone it is unclear whether this model is fast or slow.
Below, I will go through the paper sequentially and suggest specific comments that hopefully will improve the paper (L refers to the line number in the manuscript).
Specific Comments
L. 15: “partially solved” should be “be partially solved”. Otherwise, the paper is well-written with few (if any) grammatical errors.
L. 130: why does it matter that it has only linear dependence in the 1st and 3rd terms? It is still nonlinear in the 2nd.
L. 143: I don’t understand why you need a (constant) reference state. Aren’t you using a HEVI method? To me HEVI methods imply that the implicit problem is nonlinear, unless you linearize. Please elaborate on this point as this is confusing and permeates throughout the paper. In Giraldo et al JCP 2024 (which you reference), they show the pros and cons of using fully nonlinear HEVI versus various options for linearization - linearization by a constant reference state is not an option because it is not guaranteed to remain stable given large perturbations, real atmospheric states, etc.
L. 180: to my knowledge using a local reference coordinate on triangular DG elements is first discussed in the context of a spherical shallow water model in Laeuter et al. JCP 2008 (A discontinuous Galerkin method for shallow…).
L. 197: why do you choose shallow planet? Deep planet is straightforward for you to use, is it not? And more general.
L. 210: Section 3.1 is confusing to me. E.g., how is the nonlinear part treated linearly? In line 248 it is mentioned that “practical simulations reveal” that you only need half of the KE contributions. What do you mean by “practical”, and how did you arrive at this factor? What happens if you use the complete factor, and what savings do you get by omitting the other half?
L. 260-265: I'm confused by this discussion. If you are indeed using a nonlinear HEVI method then you need all the terms in order to satisfy the PDE. However, if you are using linear HEVI and only add some of the stiff terms into the implicit linear operator (and have the full nonlinear operator in the explicit part) then your omission of terms makes sense. Please elaborate on whether the HEVI method is linear or nonlinear or better said, whether you first build the full nonlinear operator explicitly and then add/subtract linear terms. See, for example, the HEVI method described in Giraldo et al JCP 2024 (for the nonlinear form) versus the linear form in Giraldo et al SISC 2013.
L. 280: Is your choice of how often to construct the linear coefficients in the same vein as what is discussed in Giraldo et al JCP 2024? Also, does including the terms that you omit keep your solution stable for longer? That would be an interesting experiment, although performing an analysis to see which terms are most important would be better; e.g., you can check which terms exhibit large growth. Can you be sure that the differences between the two equation sets are not due to your linearization choices?
L. 305: Perhaps the crux of my confusion on your time-integration stems from your discussion on linearization of the fluxes, etc. In Reddy et al JCP 2023 on IMEX forms for DG methods, the same equation sets that you study here are analyzed there for atmospheric flows. The focus of that paper is exactly on constructing the numerical fluxes in the right way to handle IMEX DG methods. Might be worth reading this paper. Maybe it’s not relevant to your discussion but I am not sure I understand this section of your paper.
L. 318: You finally write out BR1 here but use BR1 previously. Please define it the first time you use BR1 in the paper.
L. 325-330: Why did you choose SSP time-integrators? Without limiters in the spatial (DG) part, I’m not sure you will get much benefit. In addition, you may be able to find non-SSP time-integrators with larger stability regions, which would likely allow your model to run faster. Not a major point, but might be worth considering.
L. 331: Best to explain this better. By block tridiagonal you mean that a point in a column depends on the element above and below (3 elements) and the full polynomial space in the vertical. Please add the bandwidth to be clear; it should be something like 2N+1 (for a nodal DG method, where N is the degree of the polynomial).
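The coupling pattern I have in mind can be sketched as follows (element count and polynomial degree are illustrative, not taken from the manuscript): dense (N+1) x (N+1) blocks on the diagonal plus one block of coupling to the element above and below, giving semi-bandwidth 2N+1 in 1D.

```python
import numpy as np

# Illustrative sketch of vertical-column coupling: each element couples
# to itself and its upper/lower neighbours, yielding a block tridiagonal
# matrix with dense (N+1) x (N+1) blocks for a 1D nodal basis of degree N.
N = 3            # polynomial degree (assumed)
nelem = 5        # elements in the column (assumed)
b = N + 1        # block size

A = np.zeros((nelem * b, nelem * b))
for e in range(nelem):
    A[e*b:(e+1)*b, e*b:(e+1)*b] = 1.0            # diagonal block
    if e > 0:
        A[e*b:(e+1)*b, (e-1)*b:e*b] = 1.0        # lower neighbour block
    if e < nelem - 1:
        A[e*b:(e+1)*b, (e+1)*b:(e+2)*b] = 1.0    # upper neighbour block

# The farthest nonzero from the diagonal sits at distance 2(N+1)-1 = 2N+1.
i, j = np.nonzero(A)
assert np.max(np.abs(i - j)) == 2 * N + 1
```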
L. 332-335: The discussion in this part of the paper is confusing. I looked at the B21 paper, which states that you can use either quadrilateral or triangular nodal bases. In the current paper you are only using triangles, and if they are nodal, then you are likely using Proriol-Koornwinder-Dubiner orthogonal polynomials to construct the Lagrange polynomials. To do so, you need interpolation points (usually Fekete points) and then a separate set of points for integration (Gauss). My confusion is your mention of collocation: there is no set of points that is both good for interpolation and integration on the triangle; therefore collocation is not possible on simplices (see, e.g., Helenbrook SINUM 2009), only on tensor-product bases derived from the one-dimensional Lagrange polynomials, which permit collocation via Lobatto or Legendre points. Please clarify this section; I suggest removing text that does not apply to what you are actually using in the paper. I also don’t agree that the horizontal directions decouple when using collocation. In DG, the fluxes always couple edge-neighbor elements.
L. 343: I would also add “strong” scaling to a good property of DG methods.
L. 370: stating what is done in collocated methods is a moot point since you are not using these methods.
L. 402: Again, it matters whether you are using modal or nodal bases. If you use collocation in a nodal method then it is diagonal but at the cost of under-integration of the mass matrix. If you use modal, you are only diagonal if the elements are straight-edged/flat. This will not be the case on the sphere as your Jacobians are not constant. So, representing the mass matrix accurately will prevent the mass matrix from being diagonal.
L. 420: See "Adaptive hurricane simulations" for h-refinement in a CG model (Tissaoui et al. JAS 2025). Conceptually, it's no different (perhaps even harder with h-refinement than p-refinement). I don't understand the issue you are describing here.
L. 440: Good point. This is why this type of RK method is preferred.
L. 455: Not sure this statement is necessary, but if you want to include it, you should add that the value of a weak BC is that the BC will be satisfied to the same order of accuracy as the interior solution.
L. 521: Is this reference state constant? This does not work very well for real weather problems. This is the reason why using, e.g., the previous solution is typically the method of choice in HEVI methods. See, e.g., Giraldo et al JCP 2024.
L. 550: For completeness, you should mention Vandeven's theorem. See, e.g., Vandeven JSC 1991.
L. 571: I agree. Filtering is something we have moved away from. One other problem that you didn't mention is that most filters proposed are not idempotent. Therefore, changing the time-step changes the filtering - unlike using hyperdiffusion which is time-stepped. This, I think, is one of the main issues with filtering.
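To make the non-idempotency point concrete, here is a minimal sketch with a generic exponential modal filter (not necessarily the one used in the manuscript): applying it twice damps more than applying it once, so the cumulative dissipation depends on how often the filter is applied, i.e. on the time-step; a sharp cutoff (projection) filter, by contrast, is idempotent.

```python
import numpy as np

# Generic exponential modal filter sigma(k) = exp(-alpha * (k/N)**s)
# acting on modal coefficients (illustrative parameters).
N = 8
alpha, s = 36.0, 8
k = np.arange(N + 1)
sigma = np.exp(-alpha * (k / N) ** s)

u_hat = np.ones(N + 1)           # some modal coefficients
once = sigma * u_hat             # filter applied once
twice = sigma * (sigma * u_hat)  # filter applied twice

# Non-idempotent: twice != once, so dissipation scales with call count.
assert not np.allclose(once, twice)

# A sharp cutoff (projection) filter IS idempotent:
cutoff = (k <= N // 2).astype(float)
assert np.allclose(cutoff * (cutoff * u_hat), cutoff * u_hat)
```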
L. 585: Is there a reference for this case? I don't understand how you use the reference state.
L. 587: do you mean w here? It's 1D in space but not in the equations?
L. 610: Convergence is difficult to see unless you plot the theoretical convergence slopes in the plots.
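Alternatively, the empirical order between successive refinements could be tabulated next to the plot; a minimal sketch with synthetic fourth-order errors (the numbers are illustrative, not from the manuscript):

```python
import math

# Synthetic errors behaving like C*h^4 on meshes refined by factors of 2.
h = [1.0, 0.5, 0.25, 0.125]
err = [0.1 * hi**4 for hi in h]

# Observed order between successive refinements:
#   p = log(e_coarse / e_fine) / log(h_coarse / h_fine)
orders = [math.log(err[i] / err[i + 1]) / math.log(h[i] / h[i + 1])
          for i in range(len(h) - 1)]
assert all(abs(p - 4.0) < 1e-9 for p in orders)
```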
L. 625: I am surprised about this result. I have not seen this behavior in our models.
L. 643: Why don’t you expect a similar behavior between the solutions of the two equation sets?
L. 648: Please state the resulting Courant numbers.
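For reference, the quantity I have in mind is the acoustic Courant number based on an effective node spacing; all values below are hypothetical placeholders, not the manuscript's settings:

```python
import math

# Acoustic Courant number C = (|u| + c) * dt / dx_eff, with the
# effective spacing dx/(N+1) for polynomial degree N. All numbers
# are hypothetical, chosen only to illustrate the computation.
u = 40.0                             # wind speed [m/s] (assumed)
c = math.sqrt(1.4 * 287.0 * 300.0)   # sound speed at T = 300 K [m/s]
dx = 100e3                           # element width [m] (assumed)
N = 4                                # polynomial degree (assumed)
dt = 60.0                            # time step [s] (assumed)

dx_eff = dx / (N + 1)
courant = (abs(u) + c) * dt / dx_eff
```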
L. 653: Would be good to show a plot of mass conservation for the simulation. Probably best for the longer simulation cases though like the Baroclinic Instability or Held-Suarez.
L. 659: please elaborate on the plotting range comment - I don’t understand.
L. 673: Spatial error will dominate temporal error for a sufficiently small enough time-step regardless of the order of the time-integrator.
L. 676-681: unclear
L. 685: Why use this case? Why not the Ullrich BI test case which is designed for z-coordinate models and can handle nonhydrostatic deep-planet equations?
L. 710: But without filtering it is better, and for other tests you showed that the theta version requires extra stabilization. If you run the Ullrich baroclinic instability with hyperdiffusion, it should stay stable forever. See Giraldo et al. JCP 2024, where simulations are shown for over 100 days. It would be useful to do this and avoid filters.
L. 713: This is not a conservative method: in 10 days you are at 10^-15 mass conservation and at 100 days at 10^-14. I am not sure why your model does not conserve mass, but this could be a concern. How do you compute your metric terms? Giraldo et al JCP 2024 discuss conservation and metric terms.
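For what it is worth, the reported drift is consistent with a tiny per-step deficit accumulating linearly in time; a back-of-the-envelope sketch with entirely hypothetical numbers:

```python
# If a fixed relative mass deficit is lost each step (e.g. from inexact
# metric terms), the total relative drift grows linearly with the step
# count, matching a 10x growth from day 10 to day 100. All numbers here
# are hypothetical, for illustration only.
m0 = 1.0                       # initial total mass (normalised)
loss_per_step = 1e-18          # hypothetical per-step relative deficit
steps_day10, steps_day100 = 1_000, 10_000

drift10 = steps_day10 * loss_per_step / m0     # ~1e-15 at day 10
drift100 = steps_day100 * loss_per_step / m0   # ~1e-14 at day 100
assert abs(drift100 - 10 * drift10) < 1e-28
```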
L. 715: How many degrees of freedom are you computing in these simulations?
L. 720: What is the time-to-solution, and how does this compare to the current ICON model? Why not use hyper-diffusion for stabilization to make a more apples-to-apples comparison with the current operational model (or whatever stabilization ICON uses)?
L. 725: what's the Courant number?
L. 735: Perhaps BRIDGE is not expected to be used for climate simulations, but isn't this loss of mass a concern?
L. 743: you state that an important goal is to achieve local conservation of the variables. However, I don’t believe you have satisfied this with respect to mass. Is that a concern?
L. 755: Understandable. However, you used a stabilization technique that is non-standard (non-idempotent, etc.). Since you DID need additional stabilization, why then not use a standard method (i.e., hyper-diffusion) that is well understood and used by most operational models?
L. 760: The method proposed in Souza et al is kinetic energy preserving - not entropy conserving/dissipating like the work in Waruszewski.
L. 760-764: You ran the Jablonowski-Williamson BI, whereas Waruszewski et al ran the Ullrich BI. These are different cases, so I would not draw conclusions from comparing different tests.
L. 762: Entropy stable methods have already been formulated for triangles by Jesse Chan and are likely extendable to triangular prisms. See Jesse Chan, On discretely entropy conservative and entropy stable discontinuous Galerkin methods, Journal of Computational Physics, Volume 362, 2018, Pages 346-374.
L. 764: The work of Montoya et al 2025 won't help much since it uses collocation. You need to read the work by Chan.
L. 785: But is this the true setup you used in the experiments? I don't think so, since you mention a horizontal velocity. How then does this analysis apply?
L. 785: This normal mode analysis shows that the theta form is unstable whereas the energy form is not; or rather, the energy form is stable in the stable regime and unstable in the unstable regime. Is this an issue with the PDE, the numerics, or the linearization? The theta equation set is standard in many models (e.g., MPAS-A), and I have not heard of such issues before. Which equation set does ICON use?