the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
The Newton solver with step size control is faster than the Picard iteration in simulating ice flow (FEniCSfullStokes v1.1.0)
Abstract. Solving the momentum balance is the computationally expensive part of simulating the evolution of ice sheets. The momentum balance is described by the nonlinear fullStokes equations. As a nonlinear problem, they are solved iteratively. We solve these equations with Newton's method. We obtain global superlinear convergence by using a step size control. For the step size control, we need a minimization problem. Solving the fullStokes equations is equivalent to minimizing a specific convex function. We use the Armijo and the exact step sizes for Newton's method. Additionally, we use the exact step sizes for the Picard iteration. Finally, we compare the Picard iteration and the variants of Newton's method in two benchmark experiments, called ISMIPHOM experiments A and B. These experiments consist of a more realistic domain and are designed to test the quality of ice models. We obtain that Newton's method and the Picard iteration with exact step sizes greatly reduce the necessary number of iterations.

Notice on discussion status
The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint
(1229 KB)

The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.
 Preprint
(1229 KB)  Metadata XML
 BibTeX
 EndNote
 Final revised paper
Journal article(s) based on this preprint
Interactive discussion
Status: closed

RC1: 'Comment on egusphere20231569', Anonymous Referee #1, 27 Sep 2023
Summary: The paper implements a Newton method with stepsize control for solving a full Stokes model describing the dynamics of glacial ice. In idealised diagnostic numerical experiments two types of step size control are compared with each other and to Picard iterations with and without stepsize control.
General comments:
The chosen topic is important and the paper is well structured and seemingly without any obvious technical mistakes. My main concern is about the novelty of this paper, for the following reasons:
 The Newton method is faster than Picard is expected and Newtons method is already used in ice sheet models extensively, even with step size control (e.g. in Elmer/Ice). It is true that Newton iterations do not converge for all cases especially without running a few initial Picard iterations, but for the numerical experiments chosen in this paper my guess is that Newton works in most existing codes already. Perhaps exact step size is new but I think in that case a brief summary of the most common step size control methods in the most used ice sheet models should be provided. Furthermore, a significant superiority of exact stepsizes compared to Armijo is not clear from this paper.
 In comparison to the previous manuscript which is referred to in this paper: Schmidt, 2013 https://arxiv.org/pdf/2307.02930.pdf In Schmidt, 2013 more mathematical details and proves are given, but the method and experiments seems identical. Is Schmidt, 2013 indented to be published?
Nevertheless, I think the paper can be published after major revisions. My suggestions for addressing my above concerns are:
 If Schmidt, 2013 https://arxiv.org/pdf/2307.02930.pdf is not already submitted somewhere, I think the mathematical proofs here are novel enough for publication. I am not sure if the Cryosphere is the best journal for this, but possibly the mathematical proofs could be explained and summarized in the main text, and the proofs themselves be kept in the appendix.
 Otherwise, a solution which probably fits the Cryosphere better, is to extend the numerical experiments and literature study so that the importance to glaciology is clearer. If the authors can show that it is possible to use Newtons method in realistic simulations in an effective way without initial Picard iterations, I think it is a major contribution to the community. Specifically the following points should be included:
 In the introduction, go through the most important ice sheet models (especially those using the full Stokes model) and review what method they are using to resolve the nonlinearity of the problem
 Extend the numerical experiments. If the aim of this paper is to show that some method is better than another and could be useful for an ice sheet modeller, it must be for a relevant example. Most importantly, a timedependancy (free surface movement) should be introduced. This is important since the first timestep (the only one studied here) usually behaves quite differently in terms of number of iterations, as compared to later timesteps when free surface has relaxed. In a real application, there is also a contact problem between the ice and bedrock, which introduces an extra nonlinearity. The more realistic example (3D, variable bedrock) the better. Also, I think CPUtimes should be measured rather than only number of iterations.
If the original paper Schmidt, 2013 https://arxiv.org/pdf/2307.02930.pdf is kept as a separate paper, I think still some of the mathematical details must be added to this paper. I am lacking some mathematical justification of some of the key points of the paper, for instance the justification of the extra diffusion term in the equations, and explanation behind Algorithm 3.
Specific comments:
The abstract and introduction could flow better. The sentences could be longer or better connected.
L 4 and 5: I think these sentences can be removed from the abstract “For the step size control, we need a minimization problem. Minimizing a specific convex function is equivalent to solving the fullStokes equations.”
L 18: nonlinear > nonlinearly?
L 18: make it clearer that the stationary variation of these equations is the standard one
L 56: These spaces are for linear Stokes, see e.g. the work by Belenki et al: https://www.jstor.org/stable/41582741 for the appropriate spaces
L 60: Why do you add a diffusion term? This is nonstandard and not included in other ice sheet models, which might make the reader wonder if these results are applicable to their model. I suggest trying to get rid of this term.
L 71: I would not say that Picard is standard. Include some references for ice sheet models that use Picard rather than Newton. Most probably use a combination of the two.
L 9293: It’s true that in Hirn \mu_0 is 0 (and even delta=0 !) . This should be discussed  why is it not possible to set them to zero in this work?
Algorithm 3: what is the stopping criteria, i.e. how many forloop iterations do you do?
L 104: For the reader of the cryosphere, I think you should introduce the two stepsize control methods with a bit more words, discuss pros and cons etc, rather than just stating the algorithm.
Equation 1011: Does the fact that the terms are smaller really mean that one can surely say that the solution is not impacted significantly, especially in a timedependent simulation it is not clear to me.
Figure 2: I think it is misrepresentative to compare to models that are not full stokes, please only include full stokes models for the dashed lines.
About the experiments: I strongly recommend that you add a more complex example, something with more complex geometry and timedependency. A timedependent simulation of the arolla glacier maybe.
About the experiments: Add plots with relative error vs CPUtime, to show if the stepsize control increases the work per iteration significantly or not
The outlook: This is a little bit informally written.
Citation: https://doi.org/10.5194/egusphere20231569RC1 
AC1: 'Reply on RC1', Niko Schmidt, 06 Feb 2024
Thanks for taking the time to review the manuscript. We think your suggestions and ideas greatly improve the quality of the manuscript. In the introduction, we added references to ice sheet models. Moreover, we added timedependent experiments of the glacier d'Arolla with and without sliding. We also performed all experiments without the diffusion term as suggested and measured computation times for some of the experiments.
For more details, please see the attached document.

EC1: 'Comment on egusphere20231569', Ludovic Räss, 05 Dec 2023
Dear authors,
Thank you for submitting your manuscript to GMD. Given the challenges we had to find suitable reviewers for your work, I decided to provide a more substantial Topical Editor comment to replace the missing second review.
The paper investigates solutions to the Stokes equations with applications to ice flow modelling. The Stokes equations are discretised using the finiteelement method using FENICS. Two solvers implement the Picard and the Newton method, respectively. The authors apply their numerical solver to community benchmarks and discuss convergence rates as function of different step size selection strategies.
Although being an interesting and highly relevant topic, the presented work is insufficient in several aspects, mostly in terms of relevance and quality standards within GMD. In summary, the work seems to be a stripped down version of an already published preprint, where formal mathematical proofs were removed (https://arxiv.org/abs/2307.02930). Also, the work lacks in providing significant geoscientific context with respect to the applications. From the succinct text, it is difficult to grasp the novelty of the proposed work, given that the fact that Newton is faster than Picard is exactly what one would expect for the pStokes problem. Although it is a nonlinear problem, the pStokes equations are the optimality conditions of a strictly convex functional. As a result, one would expect Newton's method to work very well.
The presented results are also somewhat strange. The poor convergence of Newton solvers in most of the cases are suspicious and could reflect an implementation issue in the algorithm. It would also be interesting to see on more iteration at which "relative difference" (or error) the Picard solver stalls. Such stalls can in most cases originate from either poor scaling of the error, or an issue with the code. In general, the convergence rate of Newton's method should outperform the Picard rate, which does not seem to be the case here in most cases. Further, the Armijo method should work in a robust way for convex problems like this and it is strange to see such poor convergence.
Moreover, very little importance seems to be assigned further exploring these nonexpected results.
Based on this situation, and given the review from the first referee, significant improvement needs to be done to the current manuscript in order to be receivable in GMD. I would suggest, besides following all suggestions and addressing all critics from the first reviewer, you implement the following modifications:
 Extend the introduction to add context, comparison with existing solution strategies in other ice flow models, discuss limitations and better place your work in the glaciological modelling framework.
 Check your code implementation as the results you report look suspicious. There is still a bug that prevents quadratic convergence of Newton's solver.
 Provide further details about the implementation and performance of the solvers. The convergence plots are for sure interesting, but are not the only results to report from your study.
 Better motivate your various choices at all stages in the manuscript, providing some additional and relevant references in some fields.
 Provide a much more "indepth" analysis of your results. If no change is observed after checking the code, it may be interesting to compare the behaviour of the solvers on other traditional benchmarks, such as viscous inclusion setup or others.Finally, it would be valuable to know your position regarding the preprint from 2023 which is very similar to this paper and may have already been submitted to a more mathoriented journal.
Taking the time and making the effort to carefully revisit and substantially extend the current work may provide a valuable input for the geoscientific modelling community and could be suited for GMD. However, in the current state, the work seems closer to a rushed submission than a complete paper.
Citation: https://doi.org/10.5194/egusphere20231569EC1 
AC2: 'Reply on EC1', Niko Schmidt, 06 Feb 2024
We thank you for taking the time to review the manuscript as the topical editor and suggesting ideas to improve the quality of the manuscript.
Next to addressing all concerns from the first reviewer, we addressed your concerns. For details, please see the attached document.
We especially discussed the convergence problem of Newton's method with Armijo step sizes by simulating the experiment ISMIPHOM B with different resolutions: For higher resolutions, the relative difference reduces more. We also added a plot in the response with higher values of δ as this leads to the expected quadratic convergence and, dependent on δ, does not change the solution too much.

AC2: 'Reply on EC1', Niko Schmidt, 06 Feb 2024

RC2: 'Comment on egusphere20231569', Ludovic Räss, 05 Dec 2023
Dear authors,
Please find the my "Referee Comments" as Editor Comments in the interactive discussion, due to the challenges we encountered in finding a suitable second reviewer.
Best, Ludovic Räss  Topical Editor
Citation: https://doi.org/10.5194/egusphere20231569RC2
Interactive discussion
Status: closed

RC1: 'Comment on egusphere20231569', Anonymous Referee #1, 27 Sep 2023
Summary: The paper implements a Newton method with stepsize control for solving a full Stokes model describing the dynamics of glacial ice. In idealised diagnostic numerical experiments two types of step size control are compared with each other and to Picard iterations with and without stepsize control.
General comments:
The chosen topic is important and the paper is well structured and seemingly without any obvious technical mistakes. My main concern is about the novelty of this paper, for the following reasons:
 The Newton method is faster than Picard is expected and Newtons method is already used in ice sheet models extensively, even with step size control (e.g. in Elmer/Ice). It is true that Newton iterations do not converge for all cases especially without running a few initial Picard iterations, but for the numerical experiments chosen in this paper my guess is that Newton works in most existing codes already. Perhaps exact step size is new but I think in that case a brief summary of the most common step size control methods in the most used ice sheet models should be provided. Furthermore, a significant superiority of exact stepsizes compared to Armijo is not clear from this paper.
 In comparison to the previous manuscript which is referred to in this paper: Schmidt, 2013 https://arxiv.org/pdf/2307.02930.pdf In Schmidt, 2013 more mathematical details and proves are given, but the method and experiments seems identical. Is Schmidt, 2013 indented to be published?
Nevertheless, I think the paper can be published after major revisions. My suggestions for addressing my above concerns are:
 If Schmidt, 2013 https://arxiv.org/pdf/2307.02930.pdf is not already submitted somewhere, I think the mathematical proofs here are novel enough for publication. I am not sure if the Cryosphere is the best journal for this, but possibly the mathematical proofs could be explained and summarized in the main text, and the proofs themselves be kept in the appendix.
 Otherwise, a solution which probably fits the Cryosphere better, is to extend the numerical experiments and literature study so that the importance to glaciology is clearer. If the authors can show that it is possible to use Newtons method in realistic simulations in an effective way without initial Picard iterations, I think it is a major contribution to the community. Specifically the following points should be included:
 In the introduction, go through the most important ice sheet models (especially those using the full Stokes model) and review what method they are using to resolve the nonlinearity of the problem
 Extend the numerical experiments. If the aim of this paper is to show that some method is better than another and could be useful for an ice sheet modeller, it must be for a relevant example. Most importantly, a timedependancy (free surface movement) should be introduced. This is important since the first timestep (the only one studied here) usually behaves quite differently in terms of number of iterations, as compared to later timesteps when free surface has relaxed. In a real application, there is also a contact problem between the ice and bedrock, which introduces an extra nonlinearity. The more realistic example (3D, variable bedrock) the better. Also, I think CPUtimes should be measured rather than only number of iterations.
If the original paper Schmidt, 2013 https://arxiv.org/pdf/2307.02930.pdf is kept as a separate paper, I think still some of the mathematical details must be added to this paper. I am lacking some mathematical justification of some of the key points of the paper, for instance the justification of the extra diffusion term in the equations, and explanation behind Algorithm 3.
Specific comments:
The abstract and introduction could flow better. The sentences could be longer or better connected.
L 4 and 5: I think these sentences can be removed from the abstract “For the step size control, we need a minimization problem. Minimizing a specific convex function is equivalent to solving the fullStokes equations.”
L 18: nonlinear > nonlinearly?
L 18: make it clearer that the stationary variation of these equations is the standard one
L 56: These spaces are for linear Stokes, see e.g. the work by Belenki et al: https://www.jstor.org/stable/41582741 for the appropriate spaces
L 60: Why do you add a diffusion term? This is nonstandard and not included in other ice sheet models, which might make the reader wonder if these results are applicable to their model. I suggest trying to get rid of this term.
L 71: I would not say that Picard is standard. Include some references for ice sheet models that use Picard rather than Newton. Most probably use a combination of the two.
L 9293: It’s true that in Hirn \mu_0 is 0 (and even delta=0 !) . This should be discussed  why is it not possible to set them to zero in this work?
Algorithm 3: what is the stopping criteria, i.e. how many forloop iterations do you do?
L 104: For the reader of the cryosphere, I think you should introduce the two stepsize control methods with a bit more words, discuss pros and cons etc, rather than just stating the algorithm.
Equation 1011: Does the fact that the terms are smaller really mean that one can surely say that the solution is not impacted significantly, especially in a timedependent simulation it is not clear to me.
Figure 2: I think it is misrepresentative to compare to models that are not full stokes, please only include full stokes models for the dashed lines.
About the experiments: I strongly recommend that you add a more complex example, something with more complex geometry and timedependency. A timedependent simulation of the arolla glacier maybe.
About the experiments: Add plots with relative error vs CPUtime, to show if the stepsize control increases the work per iteration significantly or not
The outlook: This is a little bit informally written.
Citation: https://doi.org/10.5194/egusphere20231569RC1 
AC1: 'Reply on RC1', Niko Schmidt, 06 Feb 2024
Thanks for taking the time to review the manuscript. We think your suggestions and ideas greatly improve the quality of the manuscript. In the introduction, we added references to ice sheet models. Moreover, we added timedependent experiments of the glacier d'Arolla with and without sliding. We also performed all experiments without the diffusion term as suggested and measured computation times for some of the experiments.
For more details, please see the attached document.

EC1: 'Comment on egusphere20231569', Ludovic Räss, 05 Dec 2023
Dear authors,
Thank you for submitting your manuscript to GMD. Given the challenges we had to find suitable reviewers for your work, I decided to provide a more substantial Topical Editor comment to replace the missing second review.
The paper investigates solutions to the Stokes equations with applications to ice flow modelling. The Stokes equations are discretised using the finiteelement method using FENICS. Two solvers implement the Picard and the Newton method, respectively. The authors apply their numerical solver to community benchmarks and discuss convergence rates as function of different step size selection strategies.
Although being an interesting and highly relevant topic, the presented work is insufficient in several aspects, mostly in terms of relevance and quality standards within GMD. In summary, the work seems to be a stripped down version of an already published preprint, where formal mathematical proofs were removed (https://arxiv.org/abs/2307.02930). Also, the work lacks in providing significant geoscientific context with respect to the applications. From the succinct text, it is difficult to grasp the novelty of the proposed work, given that the fact that Newton is faster than Picard is exactly what one would expect for the pStokes problem. Although it is a nonlinear problem, the pStokes equations are the optimality conditions of a strictly convex functional. As a result, one would expect Newton's method to work very well.
The presented results are also somewhat strange. The poor convergence of Newton solvers in most of the cases are suspicious and could reflect an implementation issue in the algorithm. It would also be interesting to see on more iteration at which "relative difference" (or error) the Picard solver stalls. Such stalls can in most cases originate from either poor scaling of the error, or an issue with the code. In general, the convergence rate of Newton's method should outperform the Picard rate, which does not seem to be the case here in most cases. Further, the Armijo method should work in a robust way for convex problems like this and it is strange to see such poor convergence.
Moreover, very little importance seems to be assigned further exploring these nonexpected results.
Based on this situation, and given the review from the first referee, significant improvement needs to be done to the current manuscript in order to be receivable in GMD. I would suggest, besides following all suggestions and addressing all critics from the first reviewer, you implement the following modifications:
 Extend the introduction to add context, comparison with existing solution strategies in other ice flow models, discuss limitations and better place your work in the glaciological modelling framework.
 Check your code implementation as the results you report look suspicious. There is still a bug that prevents quadratic convergence of Newton's solver.
 Provide further details about the implementation and performance of the solvers. The convergence plots are for sure interesting, but are not the only results to report from your study.
 Better motivate your various choices at all stages in the manuscript, providing some additional and relevant references in some fields.
 Provide a much more "indepth" analysis of your results. If no change is observed after checking the code, it may be interesting to compare the behaviour of the solvers on other traditional benchmarks, such as viscous inclusion setup or others.Finally, it would be valuable to know your position regarding the preprint from 2023 which is very similar to this paper and may have already been submitted to a more mathoriented journal.
Taking the time and making the effort to carefully revisit and substantially extend the current work may provide a valuable input for the geoscientific modelling community and could be suited for GMD. However, in the current state, the work seems closer to a rushed submission than a complete paper.
Citation: https://doi.org/10.5194/egusphere20231569EC1 
AC2: 'Reply on EC1', Niko Schmidt, 06 Feb 2024
We thank you for taking the time to review the manuscript as the topical editor and suggesting ideas to improve the quality of the manuscript.
Next to addressing all concerns from the first reviewer, we addressed your concerns. For details, please see the attached document.
We especially discussed the convergence problem of Newton's method with Armijo step sizes by simulating the experiment ISMIPHOM B with different resolutions: For higher resolutions, the relative difference reduces more. We also added a plot in the response with higher values of δ as this leads to the expected quadratic convergence and, dependent on δ, does not change the solution too much.

AC2: 'Reply on EC1', Niko Schmidt, 06 Feb 2024

RC2: 'Comment on egusphere20231569', Ludovic Räss, 05 Dec 2023
Dear authors,
Please find the my "Referee Comments" as Editor Comments in the interactive discussion, due to the challenges we encountered in finding a suitable second reviewer.
Best, Ludovic Räss  Topical Editor
Citation: https://doi.org/10.5194/egusphere20231569RC2
Peer review completion
Postreview adjustments
Journal article(s) based on this preprint
Viewed
HTML  XML  Total  BibTeX  EndNote  

353  113  36  502  24  23 
 HTML: 353
 PDF: 113
 XML: 36
 Total: 502
 BibTeX: 24
 EndNote: 23
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1
Angelika Humbert
Thomas Slawig
The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.
 Preprint
(1229 KB)  Metadata XML