HOPE: An Arbitrary-Order Non-Oscillatory Finite-Volume Shallow Water Dynamical Core with Automatic Differentiation

Zhou, Lilong; Xue, Wei

doi:10.5194/egusphere-2025-1889

Preprints

https://doi.org/10.5194/egusphere-2025-1889

27 May 2025

Abstract. This study presents the High Order Prediction Environment (HOPE), an automatically differentiable, non-oscillatory finite-volume dynamical core for the shallow water equations on the cubed-sphere grid. HOPE integrates five key features: (1) arbitrarily high-order accuracy through genuinely two-dimensional reconstruction schemes; (2) essentially non-oscillatory behavior via adaptive polynomial order reduction in discontinuous regions; (3) exact mass conservation inherited from the finite-volume discretization; (4) automatic differentiability; and (5) GPU-native scalability through a PyTorch-based implementation. Another innovation is the intensive panel boundary treatment, which eliminates numerical instability when high-order reconstruction schemes are used and simplifies the interpolation process to a matrix-vector multiplication without losing accuracy. Numerical experiments demonstrate the capabilities of HOPE. The 11th-order scheme reduces errors to near double-precision round-off levels in steady-state geostrophic flow tests on coarse 1°×1° grids. Rossby-Haurwitz waves are maintained over 100 simulation days without crashing. A cylindrical dam-break test case confirms that the genuinely two-dimensional WENO scheme exhibits significantly better isotropy than dimension-by-dimension approaches. Two implementations are developed: a Fortran version for convergence analysis and a PyTorch version leveraging automatic differentiation and GPU acceleration. The PyTorch implementation maps the reconstruction and quadrature operations to 2D convolution and Einstein summation, respectively, achieving about a 2× speedup on a single NVIDIA RTX3090 GPU versus dual Intel E5-2699v4 CPUs. This design enables seamless coupling with neural network parameterizations, positioning HOPE as a foundational tool for next-generation differentiable atmosphere models.

Received: 23 Apr 2025 – Discussion started: 27 May 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

05 Nov 2025

HOPE: an arbitrary-order non-oscillatory finite-volume shallow water dynamical core with automatic differentiation

Lilong Zhou, Wei Xue, and Xueshun Shen

Geosci. Model Dev., 18, 8175–8201, https://doi.org/10.5194/gmd-18-8175-2025, 2025

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2025-1889', Anonymous Referee #1, 16 Jun 2025
Summary
The manuscript presents a new framework known as HOPE (High Order Prediction Environment) for the numerical solution of the shallow water equations on a cubed sphere grid. HOPE has several novel or interesting features, including options for very high order numerics, options to use inherently 2D WENO schemes, and an implementation in PyTorch that facilitates running on GPUs and provides automatic differentiability. The PyTorch implementation is timely as it facilitates integration of the model within a machine learning system, a topic that is currently of great interest.
I believe these novel features should eventually be sufficient to justify publication after some revision to clarify the presentation and discussion in the manuscript.

Points related to interpretation and understanding
Line 74. It is important to make sure that you compare like with like. A k'th order 1D finite difference derivative requires (generally) a stencil of k+1 points or cells. In HOPE you mostly discuss reconstructions rather than derivatives; a k'th order reconstruction can be done with a stencil of k cells. But if you take a difference of two reconstructions to compute a derivative then you will have used two different stencils and at least k+1 data points.
High order
It is good to see the requirement for smoothness of data mentioned for high order to be more accurate (line 65, 81-82). Advocates of high order schemes don't always mention this.

However, it is important to be precise with terminology to avoid confusion for readers (and authors!). The phrase 'convergence accuracy' (line 69) mixes up two ideas that should be kept distinct: order of accuracy and convergence rate. Convergence rate agrees with order of accuracy only for sufficiently smooth data. It is very common for the convergence rate to be less than the order of accuracy.
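
To make the distinction concrete, here is a minimal sketch that is not taken from HOPE. The function names and the two test integrands are illustrative assumptions. It measures the observed convergence rate of the composite midpoint rule, whose formal order of accuracy is 2, once on a smooth integrand and once on an integrand whose derivative is unbounded at an endpoint.

```python
import math

def midpoint(f, a, b, n):
    """Composite midpoint rule with n cells; formal order of accuracy 2."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def observed_rate(f, exact, n=64):
    """Estimate the convergence rate from the errors on n and 2n cells."""
    e_coarse = abs(midpoint(f, 0.0, 1.0, n) - exact)
    e_fine = abs(midpoint(f, 0.0, 1.0, 2 * n) - exact)
    return math.log(e_coarse / e_fine) / math.log(2.0)

# Smooth data: integral of sin(x) on [0, 1].
rate_smooth = observed_rate(math.sin, 1.0 - math.cos(1.0))
# Non-smooth data: integral of sqrt(x) on [0, 1]; derivative blows up at x = 0.
rate_rough = observed_rate(math.sqrt, 2.0 / 3.0)
```

The same second-order rule should give an observed rate near 2 for the smooth integrand and a noticeably lower rate (about 1.5 in theory) for the square root, which is the sense in which convergence rate and order of accuracy differ.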

Line 77. 'arbitrary accuracy' -> arbitrary order of accuracy. Check for other places where you have used 'accuracy' when you mean 'order of accuracy' (e.g., lines 424, 433, 458, 461, 463). Order of accuracy is not the same thing as accuracy; there are even situations where a higher order of accuracy produces a less accurate solution.

Line 224. It is not obvious that negative values of (an element of) \gamma could cause instability. Presumably the elements of R_H must be allowed to be negative, otherwise we would not be able to achieve more than second order? So why should there be such a restriction on \gamma? Please give some discussion or a reference.
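
For context, one commonly cited treatment of negative linear weights in WENO schemes is the splitting technique of Shi, Hu and Shu (2002). Whether the manuscript's \gamma^+ and \gamma^- follow exactly this recipe is an assumption here. The sketch below only illustrates the standard idea, with `theta = 3` as in that paper and purely illustrative weight values.

```python
def split_weights(gamma, theta=3.0):
    """Split linear weights (summing to 1) into two non-negative groups.

    Returns (sigma_plus, gamma_plus, sigma_minus, gamma_minus), where each
    normalised group sums to one and
    gamma[i] == sigma_plus * gamma_plus[i] - sigma_minus * gamma_minus[i].
    """
    raw_plus = [0.5 * (g + theta * abs(g)) for g in gamma]
    raw_minus = [p - g for p, g in zip(raw_plus, gamma)]
    sigma_plus, sigma_minus = sum(raw_plus), sum(raw_minus)
    gamma_plus = [p / sigma_plus for p in raw_plus]
    gamma_minus = [m / sigma_minus for m in raw_minus]
    return sigma_plus, gamma_plus, sigma_minus, gamma_minus

gamma = [-0.25, 1.5, -0.25]  # illustrative values only, not from the manuscript
sp, gp, sm, gm = split_weights(gamma)
recombined = [sp * a - sm * b for a, b in zip(gp, gm)]
```

In this formulation the nonlinear weights are computed separately from each non-negative group, and the final reconstruction is formed as sigma_plus times the "+" combination minus sigma_minus times the "-" combination. Because sigma_plus - sigma_minus = 1, the result remains a consistent combination of the candidate polynomials.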

At first glance (41) seems to be dimensionally inconsistent, since it mixes derivatives of different orders. It might be good to remind the reader that the computational coordinates x and y have effectively been non-dimensionalized by the grid spacing so that \Delta x = \Delta y = 1. (Thus, (41) is dimensionally correct, after all.) This non-dimensionalization is also appropriate to ensure that the smoothness indicators \beta_i scale with resolution in an appropriate way. What is \epsilon in (39)?
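
As a point of reference for the role of \epsilon and the unit grid spacing, here is a sketch of the classical one-dimensional fifth-order WENO weights of Jiang and Shu. It is not HOPE's two-dimensional scheme, and the value `eps = 1e-6` is an assumption (a common choice in the literature). The smoothness indicators are written directly in terms of cell averages, which implicitly takes \Delta x = 1.

```python
def weno5_weights(v, eps=1e-6):
    """Nonlinear weights for the classical 1D WENO5 reconstruction at the
    right face of the middle cell; v holds five consecutive cell averages."""
    vm2, vm1, v0, vp1, vp2 = v
    beta = (
        13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2,
        13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2,
        13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2,
    )
    d = (0.1, 0.6, 0.3)  # optimal (linear) weights
    # eps keeps the denominator non-zero when a stencil is perfectly smooth.
    alpha = [dk / (eps + bk) ** 2 for dk, bk in zip(d, beta)]
    total = sum(alpha)
    return [a / total for a in alpha]

def weno5_face_value(v, eps=1e-6):
    """Weighted combination of the three third-order candidate values."""
    vm2, vm1, v0, vp1, vp2 = v
    p = (
        (2 * vm2 - 7 * vm1 + 11 * v0) / 6,
        (-vm1 + 5 * v0 + 2 * vp1) / 6,
        (2 * v0 + 5 * vp1 - vp2) / 6,
    )
    return sum(w * pk for w, pk in zip(weno5_weights(v, eps), p))

w_smooth = weno5_weights([1.0, 2.0, 3.0, 4.0, 5.0])  # linear data
w_jump = weno5_weights([0.0, 0.0, 0.0, 1.0, 1.0])    # jump right of the cell
```

For linear data all three indicators are equal, so the nonlinear weights fall back to the optimal weights. For the jump, nearly all weight goes to the one stencil that does not cross the discontinuity.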

Section 5, general comment (to the whole community, really!): make sure you extract good, useful information from your test cases, not just attractive-looking plots! For example, see the next point, as well as the suggestion below to diagnose dissipation quantitatively.

Some of the test cases in section 5 are run with the high-order reconstruction and some are run with the WENO scheme, and the flow over the mountain case does not say which scheme is used. In an operational model one must make up one's mind which scheme to use, though, in research mode, having different options available allows one to explore sensitivities. It would be valuable for readers if you could share any knowledge and understanding you have gleaned by comparing WENO vs high order on the different test cases. Even if you don't show figures and tables for all combinations, it would be good to comment on any differences. For example, do WENO3 and WENO5 give 3rd order and 5th order convergence for the steady geostrophic flow case? Do high order schemes produce oscillations in the flow over a mountain case? Does WENO3 produce similar solutions to the third order scheme on the Rossby-Haurwitz test case?

Line 407. 'prone to collapse due to factors such as...'. To be clear, the R=4 Rossby-Haurwitz wave is unstable. Those factors, or even roundoff error, can provide a perturbation that initiates the instability, but they are not the fundamental cause of the collapse - that is the instability itself.
Line 491. That test case is actually dominated by Rossby waves, not gravity waves.

Points for discussion; the authors may or may not wish to address these in the manuscript.

Riemann solver
The Riemann solver is applied at every quadrature point before doing the quadrature. Would there be any advantage in doing the quadrature first and then applying the Riemann solver just once at each interface?

High order
Line 378. Do you have any data on how much more expensive high order is? Of course the answer will depend on implementation, computing platform, resolution, etc, but it would be good to have an idea.

Line 380. I am inclined to agree that 3rd or 5th order will be the practical choice, but I would be interested to know your reasoning.

One very useful potential application of a code like HOPE would be to help answer that question in a quantitative way. Flows of realistic complexity (therefore not very smooth), like the flow over an isolated mountain case, generally don't have exact solutions available. But if you could compute an accurate high-resolution reference solution then you would be able to plot error versus computational cost as you vary both resolution and order of accuracy.

Rossby-Haurwitz wave collapse
Lines 414-418. How long the Rossby-Haurwitz wave is sustained is a measure of how strongly numerical errors project onto the growing mode(s) at early times. After that the instability grows at its own rate until the RH wave collapses. Note that a cubed sphere (in the usual orientation) has an advantage (compared to an icosahedral grid, for example) in that its discretization errors should project onto zonal wavenumber 4 and higher harmonics, whereas zonal wavenumbers 1, 3, and 5 project onto the instability. Thus, presumably, it is roundoff errors that break the wavenumber 4 symmetry and trigger the instability for HOPE(?) If that is the case, then higher precision should delay the collapse. Have you tested that? It sounds like you are set up to be able to do that easily. Conversely, if higher precision does not delay the collapse, then that begs the question: what is breaking the wavenumber 4 symmetry to trigger the instability, and could it be an implementation bug?

Also, it is good to be aware of what the time of collapse is really telling you about the model formulation, and to look at other informative aspects of the solution. For example, you mention apparent 'dissipation' of the solution. You could measure that dissipation quantitatively by diagnosing a conserved quantity like energy or potential enstrophy, for example, and look at how their conservation depends on resolution and order of accuracy.
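
A minimal sketch of such a diagnostic follows, using the integral definitions from the Williamson et al. (1992) test suite. The flat per-cell lists, the function names, and the availability of a relative vorticity field `zeta` are assumptions made for illustration. This is not HOPE's actual data layout.

```python
G_ACCEL = 9.80616  # m s^-2, gravity value used in the Williamson et al. suite

def integrate(values, area):
    """Area-weighted sum over cells (discrete global integral)."""
    return sum(x * a for x, a in zip(values, area))

def total_energy(h, u, v, hs, area, g=G_ACCEL):
    """Kinetic plus potential energy; h is fluid depth, hs is topography height."""
    density = [
        0.5 * hi * (ui * ui + vi * vi) + 0.5 * g * ((hi + si) ** 2 - si ** 2)
        for hi, ui, vi, si in zip(h, u, v, hs)
    ]
    return integrate(density, area)

def potential_enstrophy(h, zeta, f, area):
    """Integral of (zeta + f)^2 / (2 h), with f the Coriolis parameter."""
    density = [(zi + fi) ** 2 / (2.0 * hi) for hi, zi, fi in zip(h, zeta, f)]
    return integrate(density, area)

def normalized_drift(value_t, value_0):
    """Relative change of an invariant with respect to its initial value."""
    return (value_t - value_0) / value_0
```

Plotting `normalized_drift` against time for several resolutions and reconstruction orders would turn the qualitative impression of dissipation into a number that can be compared across schemes.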

Points related to improving the clarity of the explanations

Line 18. '...reduces interpolation to matrix-vector multiplication'. When I first read this I thought it was stating the obvious: interpolation is a linear operation. The significance only became clear when I read section 3.3: even though the panel boundary treatment couples ghost points on the two sides of the boundary, it can be reduced to a straightforward matrix-vector multiplication. Perhaps you can briefly mention this two-way coupling in the abstract.
Abstract line 23. It is unclear why a separate Fortran code version is needed for 'convergence analysis'. The Fortran version is not mentioned in the main text (though the source code is provided).
Line 44. Comment: whether a spectral method conserves mass depends on which variables are chosen to be represented by a spectral expansion. For example, predicting a spectral representation of surface pressure (rather than the more usual log of surface pressure) should conserve mass in a hydrostatic model.
Line 60. At this point it is unclear which Jacobian matrix you mean. Which derivatives are computed automatically? Some more explanation is needed. Similarly on line 320: which gradients can be computed efficiently?
Line 70, also 243. 'does not surpass 7th order'. Please clarify whether you were using the MCORE code or your own implementation of something similar. Also, is this a fundamental limitation of the mathematical formulation or an issue with a particular implementation? It would be good to clarify what is meant by one-sided interpolation. You could avoid ghost points altogether by doing one-sided reconstruction, but I don't think that is what you mean. Do you mean that with one-sided interpolation there is no coupling between ghost points on the two sides of a panel boundary?
Line 71. I can guess what you mean by ghost interpolation scheme, but many readers will need more explanation at this point, or at least a forward reference to where it is discussed in more detail.
Line 84+. The WENO scheme is (or can be) used whenever the model needs to compute a flux. Is that correct? It was not clear to me.
Line 125. It is not yet clear where 'reconstructing' is used in the algorithm, hence this discussion is hard to follow. It would be good to give a brief overview of the method before getting into details. Also, if the reader does not already know what the 'C-property' is then line 125 does not help them. Either explain or omit. It would be worth adding that, although you use \phi_t in the momentum equation, in (13) you still predict \sqrt{G} \phi for mass conservation.
Line 138. It could be worth mentioning that, although LMARS is an approximate Riemann solver, it combines two high-order estimates to obtain the flux, so the result is high order.
Equation (26). A few words of explanation would be helpful. Here we know the \bar{q}_i, since they are predicted by the time stepping, and we wish to determine the coefficients a.
To be clear, do we need a version of the matrix R (31) for every grid cell, or is a single matrix R applicable to all grid cells?
Can you clarify whether the 2D WENO scheme is arbitrary order too, or is the implementation currently limited to 3rd and 5th order? (The namelist file suggests the latter.)
Lines 225 to 229. This section is confusing: You define \gamma^+ and \gamma^- then jump to expressing q(x,y) in terms of \omega^+ and \omega^-.
Line 238. '...eight panel boundaries...'. Please check!
The scheme for ghost cell interpolation neatly exploits the auto-differentiation capability of PyTorch! What do you do near panel corners? Section 3.3.1: could you please clarify, is the iterative scheme used once at setup to obtain the matrix G, and then matrix multiplication is used subsequently at run time? Presumably there can be lots of zeros in G, since cells near the centre of a panel do not affect any ghost cells; thus, could some compact representation of G be used?
Line 312. Comment: the Wicker-Skamarock RK scheme is 3rd order only for linear problems.
Which scheme was used to produce figure 8? Was it one of the WENO schemes or the arbitrary order (non-WENO) scheme? In either case, what was the order of accuracy? It is encouraging that there are no numerical oscillations in the vorticity or other fields. Is that true for all the schemes discussed, or only for the WENO schemes?
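
The comment on line 312 above can be checked numerically. The sketch below implements the three-stage Wicker-Skamarock scheme for a scalar ODE and estimates its observed order on a linear and a nonlinear test problem. The test problems, step counts, and function names are illustrative assumptions.

```python
import math

def ws_rk3_step(f, q, dt):
    """One step of the Wicker-Skamarock three-stage Runge-Kutta scheme."""
    q1 = q + dt / 3.0 * f(q)
    q2 = q + dt / 2.0 * f(q1)
    return q + dt * f(q2)

def error_at_t1(f, exact, n_steps):
    """Integrate from q(0) = 1 to t = 1 and return the absolute error."""
    q, dt = 1.0, 1.0 / n_steps
    for _ in range(n_steps):
        q = ws_rk3_step(f, q, dt)
    return abs(q - exact)

def observed_order(f, exact, n=50):
    """Order estimate from the error ratio when the step is halved."""
    return math.log(error_at_t1(f, exact, n) / error_at_t1(f, exact, 2 * n), 2)

# Linear problem q' = -q, exact solution exp(-t).
order_linear = observed_order(lambda q: -q, math.exp(-1.0))
# Nonlinear problem q' = -q^2, exact solution 1 / (1 + t).
order_nonlinear = observed_order(lambda q: -q * q, 0.5)
```

The linear problem should show an observed order close to 3, while the nonlinear one drops to about 2, consistent with the scheme failing one of the third-order conditions for general nonlinear right-hand sides.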

Points related to equations and mathematical notation
Equation (21). r is a dummy subscript in the middle expression; it should not appear in the final expression. Similarly for equation (22).
Line 162-163 does not make sense, since you have not specified any relation between k and n. There are many inconsistencies in notation in this section. n is the number of terms in a polynomial (line 162), then it is the stencil width in the x-direction (23). m is the stencil width in the y-direction (23), then it is equal to n^2 (line 169, 207, 213). k is the width of the stencil (line 163) then a dummy index for coefficients (23). Line 207: the stencil width is now h (but h is not mentioned again). In section 4 the stencil width is s_w.
Inconsistent fonts are used for the matrix \gamma (compare (32) and (35)).
Presumably (36) refers to individual elements of the matrix \gamma, not the entire matrix?
(37) and (38) don't seem to be correct. (38) implies that \sum_i \omega_i^+ = 1 and \sum_i \omega_i^- = 1. However, in order for (37) to be a proper weighted average of the p_i's we would need \sum_i (\omega_i^+ - \omega_i^-) = 1. \omega_i is mentioned in the text, but only \omega_i^+ and \omega_i^- are defined by equations. Should we assume \omega_i = \omega_i^+ - \omega_i^- ? Please check.
Equation (53) seems to be dimensionally inconsistent. Should sign(m) not be abs(m), which would pick out the upwind value of q? See also line 346. Actually, taking careful note of parentheses, the (Fortran) source code seems to be correct, but is inconsistent with equation (53).
The notation G is used for the metric (section 2) and also for the matrix to compute ghost cell values (section 3.3.1). Line 262: g should be bold font.
Lines 324 and 327: can I just check that there should be no comma between n_v and n_p, i.e., the first dimension is of size n_v \times n_p? The code (if I understand it correctly) suggests that these arrays are 5-dimensional. Also, comparing lines 327 and 334, n_{poc} seems to be the size of the second (or perhaps third) dimension, not the first.
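
To illustrate the point about equation (53): the manuscript's equation is not reproduced here, so the following is only the standard upwind form written with abs(m). It treats `m` as an advecting mass flux and `q_left`, `q_right` as the reconstructed values on either side of the interface, all of which are assumptions about the intended meaning.

```python
def upwind_flux(m, q_left, q_right):
    """Centred flux plus a dissipation term proportional to |m|.

    Both terms carry the units of m * q, and the sum reduces to
    m * q_left for m > 0 and m * q_right for m < 0.
    """
    return 0.5 * (m * (q_left + q_right) - abs(m) * (q_right - q_left))
```

With sign(m) in place of abs(m), the second term would carry the units of q alone rather than m * q, which is the dimensional inconsistency noted above.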

Points related to phrasing, typos, etc
Line 17, line 53. 'intensive' panel boundary treatment. What is meant by intensive? Perhaps a different word would be better?
Line 34. 'Unlike...' is not a complete sentence. Perhaps the preceding full stop should be a comma?
Line 78. Does 'its' refer to the new ghost interpolation scheme?
Line 92. If I understand correctly, GPU optimization and automatic differentiability are two different things; PyTorch happens to provide them both. The sentence as written implies that automatic differentiation is needed for GPU implementation, which I don't think is correct.
Line 133. Can you clarify: Gaussian quadrature along the interface (rather than, say, over some upwind region).
Line 162. The term 'order' is already overloaded. It is not necessary to talk about a k'th-order square stencil. It is enough to say k \times k stencil. See also line 206.
Line 163: n^2 is the number of cells in the stencil ('cell number in the stencil' is ambiguous). Similarly, it is the number of terms in the TPP.
Line 210. The phrase 'determine the unique weights' suggests that (32) can be solved and has a unique solution. As soon becomes clear, (32) is overdetermined and has no exact solution, and only a least squares approximate solution can be found.
Line 227. 'stencil i is smooth ... stencil i is discontinuous...'. Don't you mean the data sampled or reconstructed on stencil i is smooth or discontinuous?
Line 301. 'location, since'. Full stop and a new sentence would be better.
Equation (47). Since Einstein summation is mentioned in various places, perhaps note that there is no summation over i in (47).
Line 310. I cannot find any other mention of H.
Line 318-321. 'Both of these operations are highly optimized for execution on GPUs...' Do you mean highly optimized in the PyTorch implementation? The next sentence seems to be confusing two distinct ideas: (i) PyTorch has built-in commands for convolutions and matrix-vector multiplication, streamlining implementation (without explicit loop commands); (ii) PyTorch offers automatic differentiation, enabling efficient gradient computation.
Line 323. To be clear, n_v prognostic variables per cell.
Line 359: widely?
Line 363. You haven't said what \alpha is, other than a number that is set to zero.
Line 387. The phrase 'we set' makes it seem like you have made your own choice for \lambda_c and \theta_c. But aren't those values the standard ones for this test case?
Line 401. zonal advection -> zonal propagation. (The wave structure is not simply advected in the zonal direction; it propagates through the Rossby wave propagation mechanism.)
Line 404. Please check the units for c.
Line 481. 'handling of anomalous anisotropic characteristics'. I think the problem is that the 1D scheme lacks isotropy, rather than the data.
References: Kochkov et al.; the year should be 2024.
Citation: https://doi.org/10.5194/egusphere-2025-1889-RC1
- AC1: 'Reply on RC1', Lilong Zhou, 21 Jul 2025
  
  We sincerely appreciate your review comments. We have addressed each of the questions and suggestions point-by-point in a comprehensive response. Given the comprehensive nature of this reply, we have compiled it into a PDF file placed in the Supplement file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC1
- AC2: 'Reply on RC1', Lilong Zhou, 26 Jul 2025
  
  In reviewing our prior response to your comments, we identified an error in our earlier explanation that requires correction.
  Line 363: \alpha denotes the rotation angle between the physical north pole and the top point of the model grid (the center of the northern panel on the cubed-sphere grid in HOPE).
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC2
- AC3: 'Reply on RC1 fix error', Lilong Zhou, 31 Jul 2025
  
  We have corrected the figure and description errors in our reply to the following referee comment:
  Also, it is good to be aware of what the time of collapse is really telling you about the model formulation, and to look at other informative aspects of the solution. For example, you mention apparent 'dissipation' of the solution. You could measure that dissipation quantitatively by diagnosing a conserved quantity like energy or potential enstrophy, for example, and look at how their conservation depends on resolution and order of accuracy.
  
  We put the new reply in the Supplement file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC3
RC2: 'Comment on egusphere-2025-1889', Anonymous Referee #2, 25 Jun 2025
Review Comments on “HOPE: An Arbitrary-Order Non-Oscillatory Finite-Volume Shallow Water Dynamical Core with Automatic Differentiation”
This manuscript presents a numerical framework for atmospheric modeling based on a high-order finite-volume scheme on a cubed-sphere grid. A shallow water model has been implemented and validated through several standard test cases. The numerical results are promising. Although shock-capturing schemes are not yet standard in atmospheric dynamical cores, they may prove useful in future high-resolution applications. Furthermore, the implementation of a new software architecture, which facilitates coupling with AI components, is a valuable contribution. I recommend publication of this manuscript in GMD, subject to the following revisions:
Line 50: The model uses a finite-volume scheme. Please explain why it is characterized as a local-stencil-based model.

Line 52: It should be explicitly stated that the attractive property discussed here arises from the WENO scheme.

Figure 4: Panel (b) shows the spatial stencil for a quadratic polynomial, and panel (c) for a quartic polynomial. Please correct the caption.

Figure 7: The ghost cells are interpolated using a two-dimensional procedure, which involves solving a system of equations iteratively. I recommend the authors consider employing a one-dimensional interpolation scheme instead, as the quadrature points in ghost cells are arranged along lines connecting the corresponding points in neighboring cells. One-dimensional interpolation can simplify the interpolation and improve efficiency.
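
To illustrate how an iterative, two-way coupled ghost-cell procedure can be collapsed into a single matrix at setup time, here is a hedged sketch. The coupling model `g = A q + B g`, the small matrices, and all function names are hypothetical stand-ins, not HOPE's actual interpolation. The matrix is built by probing the linear procedure with unit vectors, which yields the same Jacobian that an automatic differentiation tool would return for a linear map.

```python
def iterate_ghost(q, A, B, n_iter=100):
    """Fixed-point iteration g <- A q + B g (hypothetical two-way coupling)."""
    n_ghost = len(A)
    g = [0.0] * n_ghost
    for _ in range(n_iter):
        g = [
            sum(a * x for a, x in zip(A[i], q))
            + sum(b * y for b, y in zip(B[i], g))
            for i in range(n_ghost)
        ]
    return g

def build_matrix(A, B, n_cells):
    """Setup step: column j is the converged response to the j-th unit vector."""
    cols = []
    for j in range(n_cells):
        e = [0.0] * n_cells
        e[j] = 1.0
        cols.append(iterate_ghost(e, A, B))
    return [[cols[j][i] for j in range(n_cells)] for i in range(len(A))]

def to_sparse(G, tol=1e-14):
    """Keep only (column, value) pairs with non-negligible entries."""
    return [[(j, v) for j, v in enumerate(row) if abs(v) > tol] for row in G]

def apply_sparse(S, q):
    """Run-time step: a sparse matrix-vector product, no iteration needed."""
    return [sum(v * q[j] for j, v in row) for row in S]

# Two ghost values, five interior cells; the last cell influences no ghost value.
A = [[0.5, 0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.3, 0.0]]
B = [[0.0, 0.2], [0.2, 0.0]]
G = build_matrix(A, B, 5)
S = to_sparse(G)
q = [1.0, 2.0, 3.0, 4.0, 5.0]
```

In this toy setting the iteration is needed only once, and cells that do not influence any ghost value simply drop out of the sparse representation.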

Subsection 3.5: As the model is based on a WENO scheme, I recommend using a TVD Runge-Kutta time integration.
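
For reference, the third-order TVD (strong-stability-preserving) Runge-Kutta scheme of Shu and Osher, written as convex combinations of forward Euler steps, can be sketched as follows for a scalar ODE. The test problem and function names are illustrative assumptions.

```python
import math

def ssp_rk3_step(f, q, dt):
    """One step of the Shu-Osher third-order TVD Runge-Kutta scheme."""
    q1 = q + dt * f(q)
    q2 = 0.75 * q + 0.25 * (q1 + dt * f(q1))
    return q / 3.0 + 2.0 / 3.0 * (q2 + dt * f(q2))

def error_at_t1(f, exact, n_steps):
    """Integrate from q(0) = 1 to t = 1 and return the absolute error."""
    q, dt = 1.0, 1.0 / n_steps
    for _ in range(n_steps):
        q = ssp_rk3_step(f, q, dt)
    return abs(q - exact)

# Nonlinear problem q' = -q^2 with exact solution 1 / (1 + t).
rhs = lambda q: -q * q
order_ssp = math.log(error_at_t1(rhs, 0.5, 50) / error_at_t1(rhs, 0.5, 100), 2)
```

Each stage is a convex combination of forward Euler updates, which is what carries a TVD property of the spatial operator under forward Euler over to the full step. The scheme should also retain third-order accuracy on this nonlinear problem.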

I suggest including results for solid rotation of a cosine bell along different directions. Please also provide time histories of normalized errors.
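
The normalized error norms usually reported for this test follow Williamson et al. (1992). A minimal sketch, assuming flat per-cell lists for the computed field `h`, the reference field `h_ref`, and the cell areas (these names are hypothetical):

```python
import math

def normalized_errors(h, h_ref, area):
    """Return the (l1, l2, linf) normalized error norms of h against h_ref."""
    diff = [a - b for a, b in zip(h, h_ref)]
    l1 = (sum(abs(d) * w for d, w in zip(diff, area))
          / sum(abs(r) * w for r, w in zip(h_ref, area)))
    l2 = math.sqrt(sum(d * d * w for d, w in zip(diff, area))
                   / sum(r * r * w for r, w in zip(h_ref, area)))
    linf = max(abs(d) for d in diff) / max(abs(r) for r in h_ref)
    return l1, l2, linf
```

Evaluating this function at each output time against the analytically rotated bell gives the time histories requested above.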

Williamson test case 2: It would also be helpful to present results obtained using the corresponding linear scheme (i.e., by applying optimal weights in WENO schemes directly). Displaying the absolute error distributions will be helpful to evaluate the grid imprinting.

Williamson test cases 5 and 6: I recommend reporting the time histories of normalized errors of total energy and potential enstrophy.

Genuine 2D scheme: The manuscript emphasizes the benefits of using a genuine two-dimensional discretization. The benefits should be demonstrated through Williamson’s standard test suite, rather than a dam-break problem, which is not representative of global atmospheric dynamics. Additionally, please quantify the computational cost differences between the dimension-by-dimension and genuinely 2D schemes.
Citation: https://doi.org/10.5194/egusphere-2025-1889-RC2
- AC4: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have provided it as a PDF in the supplement file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC4
- AC5: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have provided it as a PDF in the supplement file.
  Please note that an incorrect PDF file was uploaded in the previous version; this version of the reply should be considered authoritative.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC5
- AC6: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have provided it as a PDF in the supplement file.
  Please note that an incorrect PDF file was uploaded in the previous version; this version of the reply should be considered authoritative.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC6
- AC7: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have provided it as a PDF in the supplement file.
  Please note that an incorrect PDF file was uploaded in the previous version; this version of the reply should be considered authoritative.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC7

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-1889', Anonymous Referee #1, 16 Jun 2025
Summary
The manuscript presents a new framework known as HOPE (High Order Prediction Environment) for the numerical solution of the shallow water equations on a cubed sphere grid. HOPE has several novel or interesting features, including options for very high order numerics, options to use inherently 2D WENO schemes, and an implementation in PyTorch that facilitates running on GPUs and provides automatically differentiability. The PyTorch implementation is timely as it facilitates integration of the model within a machine learning system, a topic that is currently of great interest.
I believe these novel features should eventually be sufficient to justify publication after some revision to clarify the presentation and discussion in the manuscript.

Points related to interpretation and understanding
Line 74. It is important to make sure that you compare like with like. A k'th order 1D finite difference derivative requires (generally) a stencil of k+1 points or cells. In HOPE you mostly discuss reconstructions rather than derivatives; a k'th order reconstruction can be done with a stencil of k cells. But if you take a difference of two reconstructions to compute a derivative then you will have used two different stencils and at least k+1 data points.
High order
It is good to see the requirement for smoothness of data mentioned for high order to be more accurate (line 65, 81-82). Advocates of high order schemes don't always mention this.

However, it is important to be precise with terminology to avoid confusion for readers (and authors!) The phrase 'convergence accuracy' (line 69) mixes up two ideas that should be kept distinct: order of accuracy and convergence rate. Convergence rate agrees with order of accuracy only for sufficiently smooth data. It is very common for the convergence rate to be less than the order of accuracy.

Line 77. 'arbitrary accuracy' -> arbitrary order of accuracy. Check for other places where you have used 'accuracy' when you mean 'order of accuracy' (e.g., lines 424, 433, 458, 461, 463). Order of accuracy is not the same thing as accuracy; there are even situations where a higher order of accuracy produces a less accurate solution.

Line 224. It is not obvious that negative values of (an element of) \gamma could cause instability. Presumably the elements of R_H must be allowed to be negative, otherwise we would not be able to achieve more than second order? So why should there be such a restriction on \gamma? Please give some discussion or a reference.

At first glance (41) seems to be dimensionally inconsistent, since it mixes derivatives of different orders. It might be good to remind the reader that the computational coordinates x and y have effectively been non-dimensionalized by the grid spacing so that \Delta x = \Delta y = 1. (Thus, (41) is dimensionally correct, after all.) This non-dimensionalization is also appropriate to ensure that the smoothness indicators \beta_i scale with resolution in an appropriate way. What is \epsilon in (39)?

Section 5, general comment (to the whole community, really!): make sure you extract good, useful information from your test cases, not just attractive-looking plots! For example, see the next point, as well as the suggestion below to diagnose dissipation quantitatively.

Some of the test cases in section 5 are run with the high-order reconstruction and some are run with the WENO scheme, and the flow over the mountain case does not say which scheme is used. In an operational model one must make up one's mind which scheme to use, though, in research mode, having different options available allows one to explore sensitivities. It would be valuable for readers if you could share any knowledge and understanding you have gleaned by comparing WENO vs high order on the different test cases. Even if you don't show figures and tables for all combinations, it would be good to comment on any differences. For example, do WENO3 and WENO5 give 3rd order and 5th order convergence for the steady geostrophic flow case? Do high order schemes produce oscillations in the flow over a mountain case? Does WENO3 produce similar solutions to the third order scheme on the Rossby-Haurwitz test case?

Line 407. 'prone to collapse due to factors such as...'. To be clear, the R=4 Rossby-Haurwitz wave is unstable. Those factors, or even roundoff error, can provide a perturbation that initiates the instability, but they are not the fundamental cause of the collapse - that is the instability itself.
Line 491. That test case is actually dominated by Rossby waves, not gravity waves.

Points for discussion; the authors may or may not wish to address these in the manuscript.

Riemann solver
The Riemann solver is applied at every quadrature point before doing the quadrature. Would there be any advantage in doing the quadrature first and then applying the Riemann solver just once at each interface?

High order
Line 378. Do you have any data on how much more expensive high order is? Of course the answer will depend on implementation, computing platform, resolution, etc, but it would be good to have an idea.

Line 380. I am inclined to agree that 3rd or 5th order will be the practical choice, but I would be interested to know your reasoning.

One very useful potential application of a code like HOPE would be to help answer that question in a quantitative way. Flows of realistic complexity (therefore not very smooth), like the flow over an isolated mountain case, generally don't have exact solutions available. But if you could compute an accurate high-resolution reference solution then you would be able to plot error versus computational cost as you vary both resolution and order of accuracy.

Rossby-Haurwitz wave collapse
Lines 414-418. How long the Rossby-Haurwitz wave is sustained is a measure of how strongly numerical errors project onto the growing mode(s) at early times. After that the instability grows at its own rate until the RH wave collapses. Note that a cubed sphere (in the usual orientation) has an advantage (compared to an icosahedral grid, for example) in that its discretization errors should project onto zonal wavenumber 4 and higher harmonics, whereas zonal wavenumbers 1, 3, and 5 project onto the instability. Thus, presumably, it is roundoff errors that break the wavenumber 4 symmetry and trigger the instability for HOPE? If that is the case, then higher precision should delay the collapse. Have you tested that? It sounds like you are set up to be able to do that easily. Conversely, if higher precision does not delay the collapse, then that raises the question: what is breaking the wavenumber 4 symmetry to trigger the instability, and could it be an implementation bug?

Also, it is good to be aware of what the time of collapse is really telling you about the model formulation, and to look at other informative aspects of the solution. For example, you mention apparent 'dissipation' of the solution. You could measure that dissipation quantitatively by diagnosing a conserved quantity like energy or potential enstrophy, for example, and look at how their conservation depends on resolution and order of accuracy.
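
A minimal sketch of such a diagnostic follows. It uses the invariant definitions from the Williamson et al. (1992) shallow-water test suite: total energy density (1/2) h |v|^2 + (1/2) g (H^2 - h_s^2) with H = h + h_s, and potential enstrophy density (\zeta + f)^2 / (2h). The flat per-cell lists, the function names, and the sample values are hypothetical and do not reflect HOPE's data layout.

```python
G = 9.80616  # gravitational acceleration (m s^-2), as in Williamson et al. (1992)

def global_integral(density, area):
    """Area-weighted sum over cells, approximating a global integral."""
    return sum(d * a for d, a in zip(density, area))

def total_energy(h, u, v, hs, area):
    """Kinetic plus potential energy; h is fluid depth, hs is surface height."""
    dens = [0.5 * hi * (ui * ui + vi * vi) + 0.5 * G * ((hi + si) ** 2 - si ** 2)
            for hi, ui, vi, si in zip(h, u, v, hs)]
    return global_integral(dens, area)

def potential_enstrophy(h, zeta, f, area):
    """Potential enstrophy from relative vorticity zeta and Coriolis parameter f."""
    dens = [(zi + fi) ** 2 / (2.0 * hi) for hi, zi, fi in zip(h, zeta, f)]
    return global_integral(dens, area)

def normalized_error(value, initial):
    """Relative drift (I(t) - I(0)) / I(0) of an invariant."""
    return (value - initial) / initial

# Toy two-cell state, made up for illustration.
area = [1.0, 1.0]
hs = [0.0, 0.0]
e0 = total_energy([1000.0, 1000.0], [10.0, 10.0], [0.0, 0.0], hs, area)
e1 = total_energy([1000.0, 1000.0], [9.9, 9.9], [0.0, 0.0], hs, area)
print(normalized_error(e1, e0))
```

Plotting `normalized_error` against time for several resolutions and orders of accuracy would quantify the dissipation that is otherwise judged by eye.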

Points related to improving the clarity of the explanations

Line 18. '...reduces interpolation to matrix-vector multiplication'. When I first read this I thought it was stating the obvious: interpolation is a linear operation. The significance only became clear when I read section 3.3: even though the panel boundary treatment couples ghost points on the two sides of the boundary, it can be reduced to a straightforward matrix-vector multiplication. Perhaps you can briefly mention this two-way coupling in the abstract.
Abstract line 23. It is unclear why a separate Fortran code version is needed for 'convergence analysis'. The Fortran version is not mentioned in the main text (though the source code is provided).
Line 44. Comment: whether a spectral method conserves mass depends on which variables are chosen to be represented by a spectral expansion. For example, predicting a spectral representation of surface pressure (rather than the more usual log of surface pressure) should conserve mass in a hydrostatic model.
Line 60. At this point it is unclear which Jacobian matrix you mean. Which derivatives are computed automatically? Some more explanation is needed. Similarly on line 320: which gradients can be computed efficiently?
Line 70, also 243. 'does not surpass 7th order'. Please clarify whether you were using the MCORE code or your own implementation of something similar. Also, is this a fundamental limitation of the mathematical formulation or an issue with a particular implementation? It would be good to clarify what is meant by one-sided interpolation. You could avoid ghost points altogether by doing one-sided reconstruction, but I don't think that is what you mean. Do you mean that with one-sided interpolation there is no coupling between ghost points on the two sides of a panel boundary?
Line 71. I can guess what you mean by ghost interpolation scheme, but many readers will need more explanation at this point, or at least a forward reference to where it is discussed in more detail.
Line 84+. The WENO scheme is (or can be) used whenever the model needs to compute a flux. Is that correct? It was not clear to me.
Line 125. It is not yet clear where 'reconstructing' is used in the algorithm, hence this discussion is hard to follow. It would be good to give a brief overview of the method before getting into details. Also, if the reader does not already know what the 'C-property' is then line 125 does not help them. Either explain or omit. It would be worth adding that, although you use \phi_t in the momentum equation, in (13) you still predict \sqrt{G} \phi for mass conservation.
Line 138. It could be worth mentioning that, although LMARS is an approximate Riemann solver, it combines two high-order estimates to obtain the flux, so the result is high order.
Equation (26). A few words of explanation would be helpful. Here we know the \bar{q}_i, since they are predicted by the time stepping, and we wish to determine the coefficients a.
To be clear, do we need a version of the matrix R (31) for every grid cell, or is a single matrix R applicable to all grid cells?
Can you clarify whether the 2D WENO scheme is arbitrary order too, or is the implementation currently limited to 3rd and 5th order? (The namelist file suggests the latter.)
Lines 225 to 229. This section is confusing: You define \gamma^+ and \gamma^- then jump to expressing q(x,y) in terms of \omega^+ and \omega^-.
Line 238. '...eight panel boundaries...'. Please check!
The scheme for ghost cell interpolation neatly exploits the auto-differentiation capability of PyTorch! What do you do near panel corners? Section 3.3.1: could you please clarify, is the iterative scheme used once at setup to obtain the matrix G, and then matrix multiplication is used subsequently at run time? Presumably there can be lots of zeros in G, since cells near the centre of a panel do not affect any ghost cells; thus, could some compact representation of G be used?
Line 312. Comment: the Wicker-Skamarock RK scheme is 3rd order only for linear problems.
Which scheme was used to produce figure 8? Was it one of the WENO schemes or the arbitrary order (non-WENO) scheme? In either case, what was the order of accuracy? It is encouraging that there are no numerical oscillations in the vorticity or other fields. Is that true for all the schemes discussed, or only for the WENO schemes?
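
The comment above that the Wicker-Skamarock RK scheme is 3rd order only for linear problems can be checked numerically. The sketch below applies the three-stage update (stage fractions 1/3, 1/2, 1) to a linear and a nonlinear scalar ODE and estimates the observed order by halving the step. The test problems and function names are assumptions chosen for illustration and are unrelated to the HOPE code.

```python
import math

def ws_rk3_step(f, q, dt):
    """One Wicker-Skamarock three-stage step for dq/dt = f(q)."""
    q1 = q + dt / 3.0 * f(q)
    q2 = q + dt / 2.0 * f(q1)
    return q + dt * f(q2)

def final_error(f, exact, n_steps, t_end=1.0, q0=1.0):
    """Integrate from 0 to t_end and return the absolute error at t_end."""
    dt = t_end / n_steps
    q = q0
    for _ in range(n_steps):
        q = ws_rk3_step(f, q, dt)
    return abs(q - exact(t_end))

def observed_order(f, exact, n_steps=20):
    """Order estimate from the errors at n_steps and 2 * n_steps."""
    return math.log2(final_error(f, exact, n_steps) / final_error(f, exact, 2 * n_steps))

# Linear problem dq/dt = -q and nonlinear problem dq/dt = -q^2, both with q(0) = 1.
order_linear = observed_order(lambda q: -q, lambda t: math.exp(-t))
order_nonlinear = observed_order(lambda q: -q * q, lambda t: 1.0 / (1.0 + t))
print(f"linear: {order_linear:.2f}, nonlinear: {order_nonlinear:.2f}")
```

Under these assumptions the linear problem converges at about third order and the nonlinear one at about second order, consistent with the comment.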

Points related to equations and mathematical notation
Equation (21). r is a dummy subscript in the middle expression; it should not appear in the final expression. Similarly for equation (22).
Lines 162-163 do not make sense, since you have not specified any relation between k and n. There are many inconsistencies in notation in this section. n is the number of terms in a polynomial (line 162), then it is the stencil width in the x-direction (23). m is the stencil width in the y-direction (23), then it is equal to n^2 (lines 169, 207, 213). k is the width of the stencil (line 163), then a dummy index for coefficients (23). Line 207: the stencil width is now h (but h is not mentioned again). In section 4 the stencil width is s_w.
Inconsistent fonts are used for the matrix \gamma (compare (32) and (35)).
Presumably (36) refers to individual elements of the matrix \gamma, not the entire matrix?
(37) and (38) don't seem to be correct. (38) implies that \sum_i \omega_i^+ = 1 and \sum_i \omega_i^- = 1. However, in order for (37) to be a proper weighted average of the p_i's we would need \sum_i (\omega_i^+ - \omega_i^-) = 1. \omega_i is mentioned in the text, but only \omega_i^+ and \omega_i^- are defined by equations. Should we assume \omega_i = \omega_i^+ - \omega_i^- ? Please check.
Equation (53) seems to be dimensionally inconsistent. Should sign(m) not be abs(m), which would pick out the upwind value of q? See also line 346. Actually, taking careful note of parentheses, the (Fortran) source code seems to be correct, but is inconsistent with equation (53).
The notation G is used for the metric (section 2) and also for the matrix to compute ghost cell values (section 3.3.1). Line 262: g should be bold font.
Lines 324 and 327: can I just check that there should be no comma between n_v and n_p, i.e., the first dimension is of size n_v \times n_p? The code (if I understand it correctly) suggests that these arrays are 5-dimensional. Also, comparing lines 327 and 334, n_{poc} seems to be the size of the second (or perhaps third) dimension, not the first.
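
On the question above about (37) and (38): in the splitting technique of Shi, Hu and Shu (2002) for negative linear weights, the positive and negative weight groups are each normalized to sum to one, and the two group reconstructions are then recombined with scale factors \sigma^+ and \sigma^- that satisfy \sigma^+ - \sigma^- = 1. Whether the manuscript follows exactly this form is an assumption here. The sketch below checks the bookkeeping for a made-up set of linear weights; the function name and the values are hypothetical.

```python
def split_linear_weights(gamma, theta=3.0):
    """Split linear weights into normalized positive and negative groups.

    Returns (gamma_plus, gamma_minus, sigma_plus, sigma_minus).
    """
    tilde_plus = [0.5 * (g + theta * abs(g)) for g in gamma]
    tilde_minus = [tp - g for tp, g in zip(tilde_plus, gamma)]
    sigma_plus = sum(tilde_plus)
    sigma_minus = sum(tilde_minus)
    gamma_plus = [tp / sigma_plus for tp in tilde_plus]
    gamma_minus = [tm / sigma_minus for tm in tilde_minus]
    return gamma_plus, gamma_minus, sigma_plus, sigma_minus

# Made-up linear weights that sum to one and include a negative entry.
gamma = [-0.25, 0.75, 0.5]
gp, gm, sp, sm = split_linear_weights(gamma)

# Effective weight on each candidate polynomial p_i after recombination.
effective = [sp * a - sm * b for a, b in zip(gp, gm)]
print(sum(gp), sum(gm), sp - sm, effective)
```

In smooth regions the nonlinear weights \omega^+ and \omega^- approach these normalized linear weights, so the effective weight on each p_i is \sigma^+ \omega_i^+ - \sigma^- \omega_i^-, and the effective weights sum to one even though each group separately sums to one.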

Points related to phrasing, typos, etc
Line 17, line 53. 'intensive' panel boundary treatment. What is meant by intensive? Perhaps a different word would be better?
Line 34. 'Unlike...' is not a complete sentence. Perhaps the preceding full stop should be a comma?
Line 78. Does 'its' refer to the new ghost interpolation scheme?
Line 92. If I understand correctly, GPU optimization and automatic differentiability are two different things; PyTorch happens to provide them both. The sentence as written implies that automatic differentiation is needed for GPU implementation, which I don't think is correct.
Line 133. Can you clarify: Gaussian quadrature along the interface (rather than, say, over some upwind region).
Line 162. The term 'order' is already overloaded. It is not necessary to talk about a k'th-order square stencil. It is enough to say k \times k stencil. See also line 206.
Line 163: n^2 is the number of cells in the stencil ('cell number in the stencil' is ambiguous). Similarly, it is the number of terms in the TPP.
Line 210. The phrase 'determine the unique weights' suggests that (32) can be solved and has a unique solution. As soon becomes clear, (32) is overdetermined and has no exact solution, and only a least squares approximate solution can be found.
Line 227. 'stencil i is smooth ... stencil i is discontinuous...'. Don't you mean the data sampled or reconstructed on stencil i is smooth or discontinuous?
Line 301. 'location, since'. Full stop and a new sentence would be better.
Equation (47). Since Einstein summation is mentioned in various places, perhaps note that there is no summation over i in (47).
Line 310. I cannot find any other mention of H.
Lines 318-321. 'Both of these operations are highly optimized for execution on GPUs...' Do you mean highly optimized in the PyTorch implementation? The next sentence seems to be confusing two distinct ideas: (i) PyTorch has built-in commands for convolutions and matrix-vector multiplication, streamlining implementation (without explicit loop commands); (ii) PyTorch offers automatic differentiation, enabling efficient gradient computation.
Line 323. To be clear, n_v prognostic variables per cell.
Line 359: widely?
Line 363. You haven't said what \alpha is, other than a number that is set to zero.
Line 387. The phrase 'we set' makes it seem like you have made your own choice for \lambda_c and \theta_c. But aren't those values the standard ones for this test case?
Line 401. zonal advection -> zonal propagation. (The wave structure is not simply advected in the zonal direction; it propagates through the Rossby wave propagation mechanism.)
Line 404. Please check the units for c.
Line 481. 'handling of anomalous anisotropic characteristics'. I think the problem is that the 1D scheme lacks isotropy, rather than the data.
References: Kochkov et al.; the year should be 2024.
Citation: https://doi.org/10.5194/egusphere-2025-1889-RC1
- AC1: 'Reply on RC1', Lilong Zhou, 21 Jul 2025
  
  We sincerely appreciate your review comments. We have addressed each of the questions and suggestions point-by-point in a comprehensive response. Given the comprehensive nature of this reply, we have compiled it into a PDF file placed in the Supplement file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC1
- AC2: 'Reply on RC1', Lilong Zhou, 26 Jul 2025
  
  In reviewing our prior response to your comments, we identified an error in our earlier explanation that requires correction.
  Line 363: \alpha denotes the rotation angle between the physical north pole and the top point of the model grid (the center of the northern panel on the cubed-sphere grid in HOPE).
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC2
- AC3:
  'Reply on RC1 fix error', Lilong Zhou, 31 Jul 2025
  We have fixed the figure and description errors in our reply to the following referee comment:
  Also, it is good to be aware of what the time of collapse is really telling you about the model formulation, and to look at other informative aspects of the solution. For example, you mention apparent 'dissipation' of the solution. You could measure that dissipation quantitatively by diagnosing a conserved quantity like energy or potential enstrophy, for example, and look at how their conservation depends on resolution and order of accuracy.
  
  We put the new reply in the Supplement file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC3
RC2:
'Comment on egusphere-2025-1889', Anonymous Referee #2, 25 Jun 2025
Review Comments on “HOPE: An Arbitrary-Order Non-Oscillatory Finite-Volume Shallow Water Dynamical Core with Automatic Differentiation”
This manuscript presents a numerical framework for atmospheric modeling based on a high-order finite-volume scheme on a cubed-sphere grid. A shallow water model has been implemented and validated through several standard test cases. The numerical results are promising. Although shock-capturing schemes are not yet standard in atmospheric dynamical cores, they may prove useful in future high-resolution applications. Furthermore, the implementation of a new software architecture, which facilitates coupling with AI components, is a valuable contribution. I recommend publication of this manuscript in GMD, subject to the following revisions:
Line 50: The model uses a finite-volume scheme. Please explain why it is characterized as a local-stencil-based model.

Line 52: It should be explicitly stated that the attractive property discussed here arises from the WENO scheme.

Figure 4: Panel (b) shows the spatial stencil for a quadratic polynomial, and panel (c) for a quartic polynomial. Please correct the caption.

Figure 7: The ghost cells are interpolated using a two-dimensional procedure, which involves solving a system of equations iteratively. I recommend the authors consider employing a one-dimensional interpolation scheme instead, as the quadrature points in ghost cells are arranged along lines connecting the corresponding points in neighboring cells. One-dimensional interpolation can simplify the interpolation and improve efficiency.

Subsection 3.5: As the model is based on a WENO scheme, I recommend using a TVD Runge-Kutta time integration.

I suggest including results for solid rotation of a cosine bell along different directions. Please also provide time histories of normalized errors.

Williamson test case 2: It would also be helpful to present results obtained using the corresponding linear scheme (i.e., by applying optimal weights in WENO schemes directly). Displaying the absolute error distributions will be helpful to evaluate the grid imprinting.

Williamson test cases 5 and 6: I recommend reporting the time histories of normalized errors of total energy and potential enstrophy.

Genuine 2D scheme: The manuscript emphasizes the benefits of using a genuine two-dimensional discretization. The benefits should be demonstrated through Williamson’s standard test suite, rather than a dam-break problem, which is not representative of global atmospheric dynamics. Additionally, please quantify the computational cost differences between the dimension-by-dimension and genuinely 2D schemes.
Citation: https://doi.org/10.5194/egusphere-2025-1889-RC2
- AC4: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have put it in the supplement file as a PDF.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC4
- AC5: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have put it in the supplement file as a PDF.
  Please note that an incorrect PDF file was uploaded in the previous version; this version of the reply should be considered authoritative.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC5
- AC6: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
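  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.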
  The reply is too long to post as plain text, so we have put it in the supplement file as a PDF.
  Please note that an incorrect PDF file was uploaded in the previous version; this version of the reply should be considered authoritative.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC6
- AC7: 'Reply on RC2', Lilong Zhou, 31 Jul 2025
  
  We sincerely appreciate your questions and suggestions. Based on the issues you identified, we have implemented significant revisions to the manuscript. We commend your scientific rigor throughout this process.
  The reply is too long to post as plain text, so we have put it in the supplement file as a PDF.
  Please note that an incorrect PDF file was uploaded in the previous version; this version of the reply should be considered authoritative.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1889-AC7

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Lilong Zhou on behalf of the Authors (17 Aug 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (23 Aug 2025) by Yongze Song

RR by Anonymous Referee #1 (30 Aug 2025)

Suggestions for revision or reasons for rejection

I thank the authors for addressing my comments on the original manuscript so thoroughly, including addressing comments that were "at the authors' discretion".

Most of my remaining comments are quite minor (typos, etc), except the first one.

Line numbers refer to the tracked changes version.

1. For the zonal flow over an isolated mountain test, the true solution will not conserve angular momentum because of the mountain. Therefore, it is not correct to interpret changes in angular momentum as 'errors' or 'dissipation'. I suggest that you simply remove that diagnostic and related discussion from section 5.3. The true solution for the Rossby-Haurwitz wave test should conserve angular momentum, so it is reasonable to include that diagnostic in section 5.4.

2. The abstract (second sentence) mentions four key features, but then lists five!

3. In numerous places the manuscript mentions 'geopotential height'. Apparently, sometimes you mean geopotential \phi, sometimes you mean height h, and sometimes it is not clear which you mean. I suggest you stick to the terms 'geopotential' when you mean \phi and 'height' when you mean h; 'geopotential height' is really only useful in 3D to mean \phi / g , which can be different from geometric height, depending on approximations made. See lines 138, 143, 161, 422, 710, 711, table 1, and captions of figures 10, 11, 13, and 15.

4. Line 196: I think F_r should be F_{m_e}. Line 200: I think S_r should be S_{m_e}.

5. On line 147 r is earth's radius; on line 514 a is earth's radius and r is something else.

6. What resolution was used in Figure 10?

7. Table 1: Which Riemann solver was used?

8. Equation (83) and line 633: The equation implies that the units of c should be s^{-1}. On line 633 it appears that c has been expressed in units of days^{-1}; however, the stated units are days.

RR by Anonymous Referee #2 (01 Sep 2025)

ED: Publish subject to minor revisions (review by editor) (05 Sep 2025) by Yongze Song

AR by Lilong Zhou on behalf of the Authors (11 Sep 2025) Author's response Author's tracked changes Manuscript

ED: Publish subject to minor revisions (review by editor) (15 Sep 2025) by Yongze Song

AR by Lilong Zhou on behalf of the Authors (17 Sep 2025) Author's response Author's tracked changes Manuscript

ED: Publish as is (22 Sep 2025) by Yongze Song

AR by Lilong Zhou on behalf of the Authors (23 Sep 2025)

Journal article(s) based on this preprint

05 Nov 2025

HOPE: an arbitrary-order non-oscillatory finite-volume shallow water dynamical core with automatic differentiation

Lilong Zhou, Wei Xue, and Xueshun Shen

Geosci. Model Dev., 18, 8175–8201, https://doi.org/10.5194/gmd-18-8175-2025, 2025

Model code and software

HOPE: High Order Predition Environment Lilong Zhou https://gitee.com/DwyaneChou/FVM/tree/Pytorch/

Viewed

Total article views: 6,018 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
5,383	522	113	6,018	85	149

Views and downloads (calculated since 27 May 2025)

Month	HTML	PDF	XML	Total
May 2025	102	18	4	124
Jun 2025	118	32	18	168
Jul 2025	108	32	12	152
Aug 2025	896	46	20	962
Sep 2025	3,468	12	4	3,484
Oct 2025	152	20	6	178
Nov 2025	74	18	6	98
Dec 2025	62	44	4	110
Jan 2026	74	48	8	130
Feb 2026	88	82	4	174
Mar 2026	104	106	10	220
Apr 2026	73	33	9	115
May 2026	48	20	3	71
Jun 2026	10	5	2	17
Jul 2026	6	6	3	15

Viewed (geographical distribution)

Total article views: 6,011 (including HTML, PDF, and XML) Thereof 6,011 with geography defined and 0 with unknown origin.

Latest update: 22 Jul 2026

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

This study develops a novel physics-based weather prediction model using artificial intelligence development platforms, achieving high accuracy while maintaining strict physical conservation laws. Our algorithms are optimized for modern super computers, enabling efficient large-scale weather simulations. A key innovation is the model's inherent differentiable nature, allowing seamless integration with AI systems to enhance predictive capabilities through machine learning techniques.