the Creative Commons Attribution 4.0 License.
EnviFlux (v1.0): a simplified surface flux inversion tool based on four-dimensional variational data assimilation (4D-Var)
Abstract. This paper introduces EnviFlux, a software tool (written in C++) to study the inverse problem of estimating the distribution of fluxes of a trace gas at the Earth's surface. This is done using a four-dimensional variational technique, which combines information from (i) synthetic observations of the trace gas spanning a defined time window, (ii) a transport model linking the initial concentration and surface fluxes to predictions of the observations, and (iii) a-priori information. The novelty of this system – compared to those that attempt to solve for real data – is in its relative simplicity and low cost, allowing new ideas to be assessed and understood quickly and cheaply. This is in line with many other developments in data assimilation that are often first explored using so-called "toy" systems. Part of this paper documents this system, which is sufficiently complex to allow assimilation of in situ and total column amount (TCA) observations in arbitrary configurations, and has a flexible background error covariance model, but is simple enough to be run relatively quickly.
Another part of this paper uses EnviFlux to explore the effect of model error and observation bias on inferred surface fluxes in two example scenarios and with two observation types, namely surface in situ (SIS) and TCA. The first is a highly idealised case with a flux pair (a localised source and a localised sink), where no a-priori information concerning their positions is provided. It is found that model errors in the assimilation can severely affect the inferred flux positions and amounts, whereas observation biases affect the amounts but not the positions. The second scenario is closer to a real-world example of methane flux estimates where an a-priori is refined with observations. The effect of model error is less than for the first scenario, but is still evident, and the observation biases affect the flux amounts. In both cases, the SIS observations allow a more accurate estimate of surface flux characteristics than the TCA observations, even though the experiments are set up to allow a fair comparison of the two.
The last part of the paper raises some research questions that EnviFlux could help address, and the appendices describe the particular background error covariance scheme used.
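For orientation, the three ingredients listed in the abstract (observations over a time window, a transport model, and a-priori information) are combined in 4D-Var through a cost function. The block below gives the generic textbook strong-constraint form as a sketch only. The symbols are assumptions of this note rather than the paper's notation, and the exact formulation used in EnviFlux may differ.

```latex
% Generic strong-constraint 4D-Var cost function (textbook form, assumed notation).
%   x     : control vector (here, initial concentration and surface fluxes)
%   x_b   : a-priori (background) value of x, with error covariance B
%   y_t   : observations at time t, with error covariance R_t
%   M_t   : transport model mapping x to the concentration field at time t
%   H_t   : observation operator (e.g. in situ sampling or column integration)
\[
  J(\mathbf{x}) =
    \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b)
  + \tfrac{1}{2}\sum_{t=0}^{T}
    \bigl(\mathbf{y}_t - \mathcal{H}_t(\mathcal{M}_t(\mathbf{x}))\bigr)^{\mathrm{T}}
    \mathbf{R}_t^{-1}
    \bigl(\mathbf{y}_t - \mathcal{H}_t(\mathcal{M}_t(\mathbf{x}))\bigr)
\]
```

The analysis is the value of the control vector that minimises J. The first term penalises departures from the a-priori, and the second penalises misfit to the observations across the time window.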
Status: final response (author comments only)
-
CEC1: 'Comment on egusphere-2025-6352 - No compliance with the policy of the journal', Juan Antonio Añel, 28 Mar 2026
-
AC1: 'Reply on CEC1', Ross Bannister, 20 Apr 2026
Dear Editor,
Thank you for alerting me to these journal requirements.
I believe I have addressed the issues raised. I will e-mail the handling editor to find out how I can upload a revised manuscript. Since the review process appears to have started, it is not clear to me how to update the manuscript and the links to the code and data.
Thanks and kind wishes, Ross Bannister
Citation: https://doi.org/10.5194/egusphere-2025-6352-AC1
-
CEC3: 'Reply on AC1', Juan Antonio Añel, 20 Apr 2026
Dear author,
Thanks for your reply. You cannot upload a new version of your manuscript at the moment, before a full round of reviews has ended. To solve the pending issues regarding compliance with the code and data policy, you must reply to this comment with the revised version of the text for the "Code and Data policy" for your manuscript. Next, we will assess whether it complies with the requirements, and will provide you with feedback here in Discussions.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-6352-CEC3
-
AC2: 'Reply on CEC3', Ross Bannister, 20 Apr 2026
Dear Prof. Añel,
Many thanks for your quick reply. Below is my proposed text for the code and data availability for the manuscript. I hope this is sufficient to satisfy the journal's requirements.
Thanks and kind wishes,
Ross Bannister
Code and data availability. The EnviFlux v1.0 user guide and code are available from doi.org/10.5281/zenodo.18803399 (Bannister, 2026b) under the Creative Commons Attribution 4.0 International licence, and the data used to produce the results in this paper are available from doi.org/10.5281/zenodo.19630028 (Bannister, 2026a).
References
Bannister, R. N.: Data files for the manuscript "EnviFlux (v1.0): a simplified surface flux inversion tool based on four-dimensional variational data assimilation (4D-Var)", https://doi.org/10.5281/zenodo.19630028, 2026a.
Bannister, R. N.: EnviFlux technical and user guide, and code repository, https://doi.org/10.5281/zenodo.18803399, 2026b.
Citation: https://doi.org/10.5194/egusphere-2025-6352-AC2
-
CEC4: 'Reply on AC2', Juan Antonio Añel, 20 Apr 2026
Dear author,
Many thanks for the reply. We can now consider the current version of your manuscript to be in compliance with the Code and Data Policy of the journal. Please do not forget to include the new information in any potentially reviewed version of your manuscript.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-6352-CEC4
-
CEC2: 'Remainder on egusphere-2025-6352 - No compliance with the code and data policy', Juan Antonio Añel, 15 Apr 2026
Dear authors,
I want to insist that you must address the issues pointed out in my previous comment on compliance with the code and data policy of the journal. Otherwise, we cannot continue with the review process for your manuscript.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-6352-CEC2
-
RC1: 'Comment on egusphere-2025-6352', Anonymous Referee #1, 21 Apr 2026
The author presents a useful “toy model” data assimilation environment for flux estimation, complementing a long history of such toy models in the dynamical literature. The model description is appropriate and the presented case studies are interesting, and I recommend publication after the following comments are addressed. My comments are from my perspective as a practitioner of emissions flux inversions (i.e., an enthusiastic potential user of EnviFlux), and are primarily about the extent to which the full range of problems arising in my work could be investigated with this tool.
What is the computational performance of EnviFlux, in terms that are meaningful for a typical practitioner? For example, what would it take to infer daily fluxes at roughly 2x2.5 degree resolution for a year? How many cores? How much wall time, memory, etc?
Clarify the spatial domain(s) supported by EnviFlux (e.g., does it have to be global). Regional simulations with boundary conditions imposed by a global flux inversion are a very important case in, for example, methane, and introduce huge questions in error quantification (e.g., to what extent do we alias boundary condition error onto in-domain emissions estimates); it would be enormously useful to investigate this with a low-cost tool like EnviFlux.
What are the spatial and temporal resolutions of the case studies presented, and what is the range of possible resolutions?
How are wind fields specified? Could users prescribe assimilated wind fields?
The semi-Lagrangian approach makes simulating chemistry tricky, making this tool (at least at global resolutions) most useful for long-lived species. However, for problems like methane and N2O, it is necessary to account for 3D atmospheric sink terms due to tropospheric OH and stratospheric loss processes respectively. I would be very interested in exploring the sensitivity of inversion approaches to characterizing this sort of loss field. Could EnviFlux be adapted to such problems?
Section 3.2: observational error correlation has proven to be a big problem in my own satellite-based data assimilation. Shared surface characteristics (e.g. mountains) make for systematic errors that cannot be reduced by averaging. How hard would it be to specify more complex observational error distributions with off-diagonal terms?
Section 3.3: How easy would it be to trade out this background matrix for other approaches that might better approximate the varied operational systems used by potential EnviFlux users?
Lines 411-414: This is a very interesting discrepancy. In my very rough experience, satellite data is more “reliable” because of the representation error of surface sites, and because satellite data is generally less sensitive to dilution in a mischaracterized PBL (the convective transport explanation of Basu et al. is also plausible). Or at least, this is the way I have thought about it without rigorously investigating the issue. To what extent could these sorts of errors in convection/PBLH be studied with EnviFlux?
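The Section 3.2 comment on correlated observation errors can be made concrete with a standard result: the error variance of the mean of n observations with a shared error correlation does not shrink to zero as n grows. The sketch below assumes an idealised observation error covariance with equal variances and a single uniform correlation `rho`. That structure, and all function names, are illustrative assumptions and are not taken from EnviFlux.

```python
def variance_of_mean(sigma, rho, n):
    """Closed-form error variance of the mean of n observations, each with
    error standard deviation sigma and uniform pairwise correlation rho."""
    return sigma**2 / n * (1.0 + (n - 1) * rho)


def variance_of_mean_explicit(sigma, rho, n):
    """Same quantity computed as w^T R w with w = 1/n, where R is the
    (assumed) covariance matrix: sigma^2 on the diagonal, rho*sigma^2 elsewhere."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sigma**2 if i == j else rho * sigma**2
    return total / n**2


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        uncorrelated = variance_of_mean(1.0, 0.0, n)
        correlated = variance_of_mean(1.0, 0.3, n)
        print(n, uncorrelated, correlated)
```

With `rho = 0` the variance falls as 1/n, while for `rho > 0` it levels off near `rho * sigma**2`, which is the floor that averaging cannot remove.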
Citation: https://doi.org/10.5194/egusphere-2025-6352-RC1
-
AC3: 'Reply on RC1', Ross Bannister, 14 May 2026
-
RC2: 'Comment on egusphere-2025-6352', Anonymous Referee #2, 19 May 2026
In this study, the author has developed a simple flux inversion analysis tool, named EnviFlux, for estimating surface fluxes of atmospheric constituents. There is no doubt that this kind of tool is useful, because a flux inversion tool that is used in practice for inverse analysis is computationally expensive. It is sometimes difficult to use such a system for several sensitivity studies, even though those sensitivity experiments are needed to evaluate the reliability of inversion results. I have found this simple “toy model” scientifically sound; however, I cannot recommend this manuscript for publication because of the two major reasons described below.
1. No information on computational efficiency or usefulness as a tool
Given the information described in this manuscript, this model does not seem to have any advantage over other existing models apart from simplicity (e.g., it lacks turbulent and cloud convection processes and does not ensure tracer mass conservation). However, the most important things, namely how simple and easy to use this model is as a tool, are not well described. How fast this model can run should be explicitly discussed in comparison with existing models. I do not think that just using the semi-Lagrangian scheme can make the model faster than others, as this technique is often used in existing models. In addition, not implementing a turbulent or cumulus convection scheme may make the model fast, but it does not provide novelty, because other models can do the same thing by just turning off those processes.
2. Too simple experimental design
Usually, during the development of this kind of inversion analysis model, one should perform idealised experiments to see whether the model works appropriately. In that sense, the experiments shown in this manuscript are not novel. Furthermore, scaling the vertical wind speeds, which the author applied to examine the effects of model errors, is too simplistic and the results may not provide scientifically useful information. If the atmosphere is considered hydrostatic (this assumption is usually valid), horizontal wind velocities should be corrected accordingly when the vertical velocity is changed. Changing only the vertical velocities would break consistency with the continuity equation and induce considerable errors in mass conservation. This means that unexpected sources or sinks would occur in the atmosphere, which is not suitable for simulations of long-lived species such as CH4. In addition, the author did not consider the atmospheric sink of CH4 due to oxidation by OH, which is one of the most important features of atmospheric CH4.
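The continuity argument in this comment can be illustrated with a minimal sketch, assuming an idealised two-dimensional (x, z) incompressible flow derived from a hypothetical stream function. None of the names or fields below come from EnviFlux. The velocity field is non-divergent by construction, and multiplying only the vertical component by a factor introduces a divergence proportional to that factor minus one.

```python
import math


def velocity(x, z, w_scale=1.0):
    """Velocity (u, w) from the assumed stream function
    psi = sin(pi x) sin(pi z), with u = -dpsi/dz and w = +dpsi/dx.
    Only the vertical component is multiplied by w_scale."""
    u = -math.pi * math.sin(math.pi * x) * math.cos(math.pi * z)
    w = math.pi * math.cos(math.pi * x) * math.sin(math.pi * z)
    return u, w_scale * w


def divergence(x, z, w_scale=1.0, h=1e-4):
    """Central-difference estimate of du/dx + dw/dz at the point (x, z)."""
    u_east, _ = velocity(x + h, z, w_scale)
    u_west, _ = velocity(x - h, z, w_scale)
    _, w_up = velocity(x, z + h, w_scale)
    _, w_down = velocity(x, z - h, w_scale)
    return (u_east - u_west) / (2 * h) + (w_up - w_down) / (2 * h)


def max_abs_divergence(w_scale, n=9):
    """Largest |divergence| over an n x n set of cell-centre points in [0, 1]^2."""
    points = [(i + 0.5) / n for i in range(n)]
    return max(abs(divergence(x, z, w_scale)) for x in points for z in points)


if __name__ == "__main__":
    for scale in (1.0, 1.5, 2.0):
        print(scale, max_abs_divergence(scale))
```

For the unscaled field the estimated divergence is zero to within discretisation error. For a scaled vertical wind it equals (w_scale - 1) times dw/dz, which acts as a spurious source or sink of tracer mass unless the horizontal winds are adjusted to compensate.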
Minor comments:
L25-26: CO2, CH4, N2O; because they appear first, they should be fully spelled-out.
L34: That is true also for 4D-Var here.
L42: That is true also for CO here.
L76: Please specify what χ is here (e.g., mass concentration, mole fraction, mixing ratio).
L77: Please elaborate how to derive wind velocities and diffusion coefficients. Are they taken from reanalysis data?
L78: “u = (u v w)” should be the transpose of a vector, i.e. “u = (u v w)^T”
L82: “ppb” should be explained as “parts per billion”. In addition, this is not usually used as the unit of mixing ratio, but as that of dry air mole fraction.
L86: What does “τρ” mean?
L110: “For the experiments shown in this paper, no explicit diffusion is applied” This is a critical flaw of the experiments, which makes the results unsound.
L117: “EnviFlux does not include (sub-grid) turbulent or convective transport process” I think that if a model does not include these processes, it cannot appropriately simulate any atmospheric constituent transport. This flaw critically affects its inversion results.
L165: Does “total column amount (TCA)” mean vertically integrated amount? Usually, satellite products provide column averaged amount.
L169: “The TCA observations are found from a discretization of …” The author should note that satellite data are made with averaging kernels, which should also be incorporated in the model.
Table 1: Please elaborate the experimental settings of Table 1 in detail in the main text. Please explain the validity of experimental length of 100 days and source time-step of 30 days (while temporal correlation scale is 3 months, I’m wondering if these settings are reasonable). “nx = 33, ny=65” looks strange to me, because the number of longitudes is usually larger than that of latitude. Furthermore, how long the assimilation window was set should be clarified. Please explain the validity of the number of 4D-Var iterations (=25). Is it enough to converge the parameters?
L256-257: “Here, each is taken as a point location in the horizontal…” Is the satellite orbit assumed here derived from GOSAT? Are data gaps due to clouds (typically existing in the tropics) considered?
L443: “a simplified tool” From my point of view, the model looks immature rather than simple. If the author claims this simplicity is advantageous, that should be elaborated.
L446: “cheaply without the need for high performance computing” To claim this, the author should describe what kind of machine was used and specify wall-clock time for each experiment. Furthermore, the author should also demonstrate those computational utilities by comparing with other existing models.
Citation: https://doi.org/10.5194/egusphere-2025-6352-RC2
Data sets
EnviFlux Vn1.0: Technical and User Guide Ross Bannister https://doi.org/10.5281/zenodo.18803399
Model code and software
Source code Ross Bannister https://doi.org/10.5281/zenodo.18803399
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 868 | 378 | 99 | 1,345 | 55 | 78 |
Ross Noel Bannister
EnviFlux is described, and then used to show how model error and bias can influence the inferred surface flux features. This is done in two scenarios, one with no prior knowledge of a source/sink pair, and another with prior knowledge in a more realistic situation.
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
First, it is important to note that although you have provided the link to a permanent repository for the code described in your manuscript, you have failed to include a proper citation in the "Code and Data Availability" section of your manuscript. Because of this, your manuscript should not have been accepted for Discussions, and a new version of the manuscript fixing the mentioned issue should have been requested before allowing it into Discussions and peer review. As this information is mandatory for Discussions and peer review, I am making public here the link to the repository that you provided internally and which allows access to the code: https://doi.org/10.5281/zenodo.18803399
Additionally, you have not provided a repository for the data used in your study. Instead, you simply provide a table in the text describing it. The GMD review and publication process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends, and on ensuring the provenance and replicability of the published papers for years after their publication. Please, therefore, publish your data in one of the appropriate repositories and reply to this comment with the relevant information (a link and a permanent identifier for it, e.g. a DOI) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
The "Code and Data Availability" section must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel
Geosci. Model Dev. Executive Editor