the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Automatic adjoint-based inversion schemes for geodynamics: Reconstructing the evolution of Earth’s mantle in space and time
Abstract. Reconstructing the thermo-chemical evolution of Earth's mantle and its diverse surface manifestations is a widely-recognised grand challenge for the geosciences. It requires the creation of a digital twin: a digital representation of Earth's mantle across space and time that is compatible with available observational constraints on the mantle's structure, dynamics and evolution. This has led geodynamicists to explore adjoint-based approaches that reformulate mantle convection modelling as an inverse problem, in which unknown model parameters can be optimised to fit available observational data. Whilst recent years have seen a notable increase in the use of adjoint-based methods in geodynamics, the theoretical and practical challenges of deriving, implementing and validating adjoint systems for large-scale, non-linear, time-dependent problems, such as global mantle flow, has hindered their broader use. Here, we present the Geoscientific Adjoint Optimisation Platform (G-ADOPT), an advanced computational modelling framework that overcomes these challenges for coupled, non-linear, time-dependent systems. By integrating three main components: (i) Firedrake, an automated system for the solution of partial differential equations using the finite element method; (ii) Dolfin-Adjoint, which automatically generates discrete adjoint models in a form compatible with Firedrake; and (iii) the Rapid Optimisation Library, ROL, an efficient large-scale optimisation toolkit; G-ADOPT enables the application of adjoint methods across geophysical continua, showcased herein for geodynamics. Through two sets of synthetic experiments, we demonstrate application of this framework to the initial condition problem of mantle convection, in both square and annular geometries, for both isoviscous and non-linear rheologies. We confirm the validity of the gradient computations underpinning the adjoint approach, for all cases, through second-order Taylor remainder convergence tests, and subsequently demonstrate excellent recovery of the unknown initial conditions. Moreover, we show that the framework achieves theoretical computational efficiency. Taken together, this confirms the suitability of G-ADOPT for reconstructing the evolution of Earth's mantle in space and time. The framework overcomes the significant theoretical and practical challenges of generating adjoint models, and will allow the community to move from idealised forward models to data-driven simulations that rigorously account for observational constraints and their uncertainties using an inverse approach.
-
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
-
Preprint
(11218 KB)
-
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
- Preprint
(11218 KB) - Metadata XML
- BibTeX
- EndNote
- Final revised paper
Journal article(s) based on this preprint
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2023-2683', Georg Reuber, 04 Dec 2023
Review: Automatic adjoint-based inversion schemes for geodynamics: Reconstructing the evolution of Earth’s mantle in space and time
By Sia Ghelichkhan, Angus Gibson, D. Rhodri Davies, Stephan C. Kramer, and David A. Ham
The authors present an open-source mantle convection inversion modelling library based on Firedrake, ROL and dolfin-adjoint. The library makes use of Firedrake’s high level of abstraction to discretize the forward problem and automatically derive the adjoint equation and derivative expressions via UFL. A connection to ROL allows for usage of scipy optimizers. They solve 2 toy inverse problems while investigating the effects of their regularization choice. The contribution is valuable to the community as their library might facilitate prototyping of mantle convection related inverse problems. The choice of regularization terms can be adapted for future (mantle) reconstructions.
The paper is very well written and mostly well structured. Especially the description of the numerical implications of DTO versus OTD is well formulated.
I recommend publication with minor modifications.
Major comments:
Collect all mathematical symbols in Table 1.
A comparison to existing automatic adjoint frameworks for PDEs, e.g. FEniCSx adjoint, TAO based PETSc adjoint, Julia AdFEM, etc might be valuable. On one hand it remains unclear what parts the library are abstracted away from a potential user in contrast to directly using Firedrake, FEniCS, PETSc etc. On the other hand, it is unclear what the boundaries on the flexibility of the library are. Such a discussion on the library would facilitate potential users’ choice to use the author’s library for their inverse problem at hand. Some questions that remained unclear to me are:
- How to include and load data – is it limited to the data that the authors present toy inversions for?
- does changing any rheology require function overloading or is the weak form and constitutive laws an input?
- What types of BCs are pre-implemented?
- Are other time discretizations pre-implemented or does a potential user have to provide it?
- Particle advection included and in what form?
- How to configure the actual solver (multigrid etc?)
- Visualization capabilities
- Support for complicated initial meshes, e.g. via gmesh interface
- …
Even if these questions are answered in previous work of the authors, it would be helpful to mention included functionality and potentially cite the work again.
Also, as the code is open source, the code screenshots as well as their detailed description could then be moved to an Appendix.
447 & equation 8. As you point out, this will also act as a regularizer on the reconstruction. Can you elaborate what (types of) reconstructions this will affect?
The analysis on finding the optimal scalings for the regularization, as well as velocity term respectively, is interesting (even though generalization of it might be limited). As the authors put some focus on the effect of the regularization, visualizing the reconstructions of their scaling parameter search to show the effect of the regularization can be interesting. This could also be done in an Appendix.
Even though the code is open source and potential users could do performance tests themselves it would greatly facilitate the choice to use the author’s library if they would provide runtime performance results, as well as more general information, like:
- Wall clock time of their simulations
- Used hardware
- Integration points per element
- Parallelization capabilities (e.g. matrix free solvers for forward Newton and inverse Newton, full PETSc MPI support?)
- Functionality of FEniCSx and PETSc incorporated and abstracted to what level? (e.g. solver availability? E.g. all multigrid solvers from PETSc configurable, if geometric multigrid how to discretize etc?)
- …
The data types used in the toy inversions are explained in the discussion section. To facilitate reading one could move this part to a separate section, potentially before presenting the inversions. I agree with the authors that a substantial discussion of the used data types is beyond the scope of this work, which should mainly present their library. Nevertheless, a discussion of availability and quality of (current) global (on each grid point) temperature data might be helpful, as it might be strongly priored by a previous model.
Minor comments:
Note that there is no guarantee that automatic differentiation, even on the symbolic level, leads to the maximum possible efficiency, e.g. derivations by hand might, in complex systems, still find more efficient substitutions.
Equation 7:
- 7a u_obs should be a vector (bold). Also, squaring u_obs probably means taking the inner product here?
- Explain further your choice to normalize against equation 7a, or temperature, and not using a standard scaling for all terms
91: Sentence is hard to understand.
168: ther – their
430: should the velocity vector u be bold?
Figure 3 is missing letters in the subfigures
Figure 6: Missing characters in the text boxes on the left. Typo: Inital -> Initial
738: Missing space
Â
Sincerely Yours,
Georg Reuber
Citation: https://doi.org/10.5194/egusphere-2023-2683-RC1 -
RC2: 'Comment on egusphere-2023-2683', Anonymous Referee #2, 02 Jan 2024
I enjoyed reviewing this manuscript, which explores the performance of adjoint mantle convection models aimed at initial condition recovery in the context of a powerful new computational modeling framework, i.e. the Geoscientific Adjoint Optimisation Platform (G- ADOPT ).
The MS is well written, the figures are effective, the results are well presented and should be of broad interest to the geodynamic modeling community. To say it upfront: I strongly urge publication as is.
Three advances make the MS particularly noteworthy.
1) The authors introduce the Geoscientific Adjoint Optimisation Platform (G- ADOPT) as an efficient computational modelling platform. They also present a number of impressive computational and accuracy performance measures. This will be helpful as a wider geodynamic user community wishes to adopt adjoint based methods for their work.
2) The authors introduce the effects of damping and smoothing to the inverse problem.
3) The authors introduce a surface velocity misfit into the objective functional. The spatial extend of the associated kernel differs from the thermal misfit. Importantly, the measure injects misfit information throughout the simulation period. This makes the new measure complementary to the mere use of the final thermal misfit information applied in earlier adjoint models. I regard this as an important step forward.
(in my copy of the MS, the labels of Figure3 A-D are not printed, this needs to be fixed)
I have three minor comments:
1) The lower misfit reduction in the non-linear experiment relative to the isoviscous experiment is attributed both to the higher Rayleigh number and to the extended simulation period. The latter appears to approach a transit time, although this seemingly is not made explicit. However, a higher Rayleigh number should help in the adjoint performance, as unrecoverable effects from thermal diffusion are reduced. So it would seem that the longer simulation period is the more relevant factor when explaining the lower misfit reduction.
2) Surface velocity misfits are helpful for initial condition recovery if the surface velocities exclusively represent the influence of mantle flow. There are observations (e.g. Late Miocene slow down of South American plate velocity) that this may not be the case.
3) The authors rightfully acknowledge the inverse crime, i.e. their experiments neglect effects from uncertainty in the use of the forward model and the relevant observations. The inverse crime effects in geodynamic adjoint models where studied explicitly in Colli etal. 2020 Â and that study could be cited.
Citation: https://doi.org/10.5194/egusphere-2023-2683-RC2 - EC1: 'Comment on egusphere-2023-2683', Boris Kaus, 13 Jan 2024
-
CC1: 'Comment on egusphere-2023-2683', Nicolas Coltice, 15 Jan 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-2683/egusphere-2023-2683-CC1-supplement.pdf
- AC1: 'Response to all reviewer and community comments', Siavash Ghelichkhan, 30 Mar 2024
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2023-2683', Georg Reuber, 04 Dec 2023
Review: Automatic adjoint-based inversion schemes for geodynamics: Reconstructing the evolution of Earth’s mantle in space and time
By Sia Ghelichkhan, Angus Gibson, D. Rhodri Davies, Stephan C. Kramer, and David A. Ham
The authors present an open-source mantle convection inversion modelling library based on Firedrake, ROL and dolfin-adjoint. The library makes use of Firedrake’s high level of abstraction to discretize the forward problem and automatically derive the adjoint equation and derivative expressions via UFL. A connection to ROL allows for usage of scipy optimizers. They solve 2 toy inverse problems while investigating the effects of their regularization choice. The contribution is valuable to the community as their library might facilitate prototyping of mantle convection related inverse problems. The choice of regularization terms can be adapted for future (mantle) reconstructions.
The paper is very well written and mostly well structured. Especially the description of the numerical implications of DTO versus OTD is well formulated.
I recommend publication with minor modifications.
Major comments:
Collect all mathematical symbols in Table 1.
A comparison to existing automatic adjoint frameworks for PDEs, e.g. FEniCSx adjoint, TAO based PETSc adjoint, Julia AdFEM, etc might be valuable. On one hand it remains unclear what parts the library are abstracted away from a potential user in contrast to directly using Firedrake, FEniCS, PETSc etc. On the other hand, it is unclear what the boundaries on the flexibility of the library are. Such a discussion on the library would facilitate potential users’ choice to use the author’s library for their inverse problem at hand. Some questions that remained unclear to me are:
- How to include and load data – is it limited to the data that the authors present toy inversions for?
- does changing any rheology require function overloading or is the weak form and constitutive laws an input?
- What types of BCs are pre-implemented?
- Are other time discretizations pre-implemented or does a potential user have to provide it?
- Particle advection included and in what form?
- How to configure the actual solver (multigrid etc?)
- Visualization capabilities
- Support for complicated initial meshes, e.g. via gmesh interface
- …
Even if these questions are answered in previous work of the authors, it would be helpful to mention included functionality and potentially cite the work again.
Also, as the code is open source, the code screenshots as well as their detailed description could then be moved to an Appendix.
447 & equation 8. As you point out, this will also act as a regularizer on the reconstruction. Can you elaborate what (types of) reconstructions this will affect?
The analysis on finding the optimal scalings for the regularization, as well as velocity term respectively, is interesting (even though generalization of it might be limited). As the authors put some focus on the effect of the regularization, visualizing the reconstructions of their scaling parameter search to show the effect of the regularization can be interesting. This could also be done in an Appendix.
Even though the code is open source and potential users could do performance tests themselves it would greatly facilitate the choice to use the author’s library if they would provide runtime performance results, as well as more general information, like:
- Wall clock time of their simulations
- Used hardware
- Integration points per element
- Parallelization capabilities (e.g. matrix free solvers for forward Newton and inverse Newton, full PETSc MPI support?)
- Functionality of FEniCSx and PETSc incorporated and abstracted to what level? (e.g. solver availability? E.g. all multigrid solvers from PETSc configurable, if geometric multigrid how to discretize etc?)
- …
The data types used in the toy inversions are explained in the discussion section. To facilitate reading one could move this part to a separate section, potentially before presenting the inversions. I agree with the authors that a substantial discussion of the used data types is beyond the scope of this work, which should mainly present their library. Nevertheless, a discussion of availability and quality of (current) global (on each grid point) temperature data might be helpful, as it might be strongly priored by a previous model.
Minor comments:
Note that there is no guarantee that automatic differentiation, even on the symbolic level, leads to the maximum possible efficiency, e.g. derivations by hand might, in complex systems, still find more efficient substitutions.
Equation 7:
- 7a u_obs should be a vector (bold). Also, squaring u_obs probably means taking the inner product here?
- Explain further your choice to normalize against equation 7a, or temperature, and not using a standard scaling for all terms
91: Sentence is hard to understand.
168: ther – their
430: should the velocity vector u be bold?
Figure 3 is missing letters in the subfigures
Figure 6: Missing characters in the text boxes on the left. Typo: Inital -> Initial
738: Missing space
Â
Sincerely Yours,
Georg Reuber
Citation: https://doi.org/10.5194/egusphere-2023-2683-RC1 -
RC2: 'Comment on egusphere-2023-2683', Anonymous Referee #2, 02 Jan 2024
I enjoyed reviewing this manuscript, which explores the performance of adjoint mantle convection models aimed at initial condition recovery in the context of a powerful new computational modeling framework, i.e. the Geoscientific Adjoint Optimisation Platform (G- ADOPT ).
The MS is well written, the figures are effective, the results are well presented and should be of broad interest to the geodynamic modeling community. To say it upfront: I strongly urge publication as is.
Three advances make the MS particularly noteworthy.
1) The authors introduce the Geoscientific Adjoint Optimisation Platform (G- ADOPT) as an efficient computational modelling platform. They also present a number of impressive computational and accuracy performance measures. This will be helpful as a wider geodynamic user community wishes to adopt adjoint based methods for their work.
2) The authors introduce the effects of damping and smoothing to the inverse problem.
3) The authors introduce a surface velocity misfit into the objective functional. The spatial extend of the associated kernel differs from the thermal misfit. Importantly, the measure injects misfit information throughout the simulation period. This makes the new measure complementary to the mere use of the final thermal misfit information applied in earlier adjoint models. I regard this as an important step forward.
(in my copy of the MS, the labels of Figure3 A-D are not printed, this needs to be fixed)
I have three minor comments:
1) The lower misfit reduction in the non-linear experiment relative to the isoviscous experiment is attributed both to the higher Rayleigh number and to the extended simulation period. The latter appears to approach a transit time, although this seemingly is not made explicit. However, a higher Rayleigh number should help in the adjoint performance, as unrecoverable effects from thermal diffusion are reduced. So it would seem that the longer simulation period is the more relevant factor when explaining the lower misfit reduction.
2) Surface velocity misfits are helpful for initial condition recovery if the surface velocities exclusively represent the influence of mantle flow. There are observations (e.g. Late Miocene slow down of South American plate velocity) that this may not be the case.
3) The authors rightfully acknowledge the inverse crime, i.e. their experiments neglect effects from uncertainty in the use of the forward model and the relevant observations. The inverse crime effects in geodynamic adjoint models where studied explicitly in Colli etal. 2020 Â and that study could be cited.
Citation: https://doi.org/10.5194/egusphere-2023-2683-RC2 - EC1: 'Comment on egusphere-2023-2683', Boris Kaus, 13 Jan 2024
-
CC1: 'Comment on egusphere-2023-2683', Nicolas Coltice, 15 Jan 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-2683/egusphere-2023-2683-CC1-supplement.pdf
- AC1: 'Response to all reviewer and community comments', Siavash Ghelichkhan, 30 Mar 2024
Peer review completion
Journal article(s) based on this preprint
Viewed
HTML | XML | Total | BibTeX | EndNote | |
---|---|---|---|---|---|
328 | 176 | 33 | 537 | 23 | 20 |
- HTML: 328
- PDF: 176
- XML: 33
- Total: 537
- BibTeX: 23
- EndNote: 20
Viewed (geographical distribution)
Country | # | Views | % |
---|
Total: | 0 |
HTML: | 0 |
PDF: | 0 |
XML: | 0 |
- 1
Siavash Ghelichkhan
Angus Gibson
D. Rhodri Davies
Stephan C. Kramer
David A. Ham
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
- Preprint
(11218 KB) - Metadata XML