the Creative Commons Attribution 4.0 License.
High-resolution greenhouse gas flux inversions using a machine learning surrogate model for atmospheric transport
Abstract. Quantifying greenhouse gas (GHG) emissions is critically important for projecting future climate and assessing the impact of environmental policy. Estimating GHG emissions using atmospheric observations is typically done using source-receptor relationships (i.e., “footprints”). Constructing these footprints can be computationally expensive and is rapidly becoming a computational bottleneck for studying GHG fluxes at high spatio-temporal resolution using dense observations. Here we demonstrate a computationally efficient GHG flux inversion framework using a machine learning emulator for atmospheric transport (FootNet) as a surrogate for the full-physics model. The footprints generated by FootNet are at approximately one-kilometer resolution. We update the architecture of the deep-learning model to improve the performance in a GHG flux inversion. Paradoxically, the updated FootNet model out-performs the full-physics model when used in a flux inversion and compared against independent observations. This improved performance is likely because atmospheric transport simulated with a full-physics transport model is not necessarily more accurate. The more simplistic representation of transport in the machine learning model helps to mitigate transport errors. This flux inversion using a machine learning surrogate model only requires meteorological data, GHG measurements, and prior fluxes. Constructing footprints using FootNet is 650× faster than the full-physics atmospheric transport model on similar hardware. This speedup allows for computation of footprints “on-the-fly” during the GHG flux inversion (i.e., computed as needed, rather than archiving for future use) and makes near-real-time emission monitoring computationally possible. This work alleviates a major computational bottleneck with inferring GHG fluxes with next generation dense observing systems.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2918', Anonymous Referee #1, 08 Oct 2024
The authors introduce a machine-learning approach for atmospheric inversion in which Lagrangian “footprints” are emulated on the fly, thereby avoiding tedious precomputations and storage.
The paper seems to be based on a report, He et al (2024b), quoted 12 times. The report has not been peer-reviewed, which can be easily understood when reading it. Its lack of scientific maturity is unfortunately shared by this submission as I will explain below. The paper remains interesting by reflecting the current rise in AI techniques, for instance in the field of operational meteorology, but I encourage the authors to revise their approach and their text thoroughly.
Major comments
- The framework is clearly Lagrangian and completely ignores the Eulerian framework that is usually implemented to address some of its limitations, for instance for the processing of large amounts of observations like satellite data. The two approaches, Lagrangian and Eulerian, are complementary and their respective limits should not be considered in isolation.
- The abstract does not reflect the approach fairly. For instance, the high spatial resolution of FootNet should not be reported without its near-absence of temporal resolution.
- The logic behind the use of Gaussian plume models in some studies (l. 68-70) has not been understood: it has nothing to do with “favourable atmospheric conditions” but rather with the possibility to adjust wind direction.
- The test dataset seems to have been extracted from the same time period as the training dataset. This may explain the very short list of input data (Table 1) but suggests a very limited generalization capability necessitating continuous retraining and therefore reducing the computational advantage.
- The absence of temporal resolution in the FootNet footprints is a major limitation that will appear to many as a show-stopper. Who would accept the authors’ empirical fix after having visualized what it means in practice in Figure 1?
- The authors do not discuss mass conservation in FootNet, but this is the core of atmospheric inversion. Who would accept footprints that do not conserve tracer mass?
- The explanation of the better performance of FootNet in one experiment blames the physical model, but could well be due to a compensation between FootNet errors, in particular in time, and errors in the assignment of prior error correlations.
- Sensitivity tests on some minor aspects of the initial non-peer-reviewed FootNet are more like filler.
- After 11 pages, the list of FootNet input data in Table 1 may come as a shock. The diurnal cycle of meteorology is only sampled four times. Wind fields in input are used at one level only (10 meters). Boundary layer turbulence is represented by planetary boundary layer height only. Convective exchanges are not represented at all. Surprisingly, the authors preferred to introduce some Gaussian plume variables without clearly understanding the need to make them explicit to FootNet.
- Joined with the absence of temporal resolution in output, we are closer to a toy model than to ensuring that climate goals are met (l. 309).
I am skipping minor comments.
Citation: https://doi.org/10.5194/egusphere-2024-2918-RC1
CC1: 'Reply to borderline inappropriate comment from Reviewer #1', Alexander Turner, 09 Oct 2024
To preface this, we feel that the tone and tenor of this review are borderline inappropriate.
The manuscript under review here is part of a bigger effort to ultimately develop a machine learning model that can efficiently predict the footprints for any region and be used within an inversion system. The first paper (He et al., under review at GMD: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-1526/) was a demonstration that we can emulate these footprints in both simple and complex environments; in other words, this was a proof of concept. The second paper (this manuscript) demonstrates how this emulator can be used with real data in a GHG flux inversion. These are the three papers that ultimately build into a generalizable system that can be efficiently utilized for any region:
- Paper #1) proof of concept that we can emulate footprints for an easy case study and a difficult case study. It was not clear that we could do this at the outset of the work.
- Paper #2) demonstration that this emulator can be used in a GHG flux inversion with minimal error induced. This is the paper under review here.
- Paper #3) generalize this system to both surface and satellite footprints to work for any region over CONUS. This is the ultimate goal of our work.
We have promising results towards that 3rd paper demonstrating a truly generalizable ML approach. However, there were a number of steps that needed to happen first. These first two papers address some of the crucial milestones.
Below, we correct the misunderstandings and detail the seemingly minor responses to the specific comments from the reviewer:
- Reviewer: The framework is clearly Lagrangian and completely ignores the Eulerian framework that is usually implemented to address some of its limitations, for instance for the processing of large amounts of observations like satellite data. The two approaches, Lagrangian and Eulerian, are complementary and their respective limits should not be considered in isolation.
- The reviewer is correct that the framework is clearly Lagrangian. This is why the focus of the text is on Lagrangian models. To our knowledge, there are no Eulerian frameworks currently being used to estimate spatio-temporally resolved GHG fluxes at 1-km spatial resolution, thus we focused on Lagrangian frameworks. The IMI framework (Varon et al., 2022; Estrada et al., submitted) can estimate methane emissions at ~25×25 km² spatial resolution and is probably the closest Eulerian analog.
- We will add these references to our manuscript.
- Reviewer: The abstract does not reflect the approach fairly. For instance, the high spatial resolution of FootNet should not be reported without its near-absence of temporal resolution.
- It seems that the reviewer missed Section 4 and Figs 1-2. Section 4 is a direct test of this choice of temporal resolution. Using the full-physics model, the formulation of the temporal information proposed here outperforms the temporally resolved version.
- We will update the text to clarify these experiments and the performance.
- Reviewer: The logic behind the use of Gaussian plume models in some studies (l. 68-70) has not been understood: it has nothing to do with “favourable atmospheric conditions” but rather with the possibility to adjust wind direction.
- Quoting from the conclusions of Nassar et al. (2017): “Gaussian plume models are commonly used in point source modeling (Bovensmann et al., 2010; Fioletov et al., 2015) and are attractive for their simplicity, but like all models, they have some limitations. Acknowledging such limitations, we have applied the model only to flat regions and to moderate distances and times (up to ~50 km and ~3 h) since our implementation assumes constant emissions and a constant wind speed and direction”. Papers that use this approach acknowledge that it is only usable for flat regions with constant emissions and a constant wind speed.
- We will update our text to explicitly state this.
- Reviewer: The test dataset seems to have been extracted from the same time period as the training dataset. This may explain the very short list of input data (Table 1) but suggests a very limited generalization capability necessitating continuous retraining and therefore reducing the computational advantage.
- As mentioned above, we focused on two case studies for the initial work to demonstrate feasibility in simple and complex conditions. Failure in these case studies would have indicated that we should change directions. A forthcoming analysis demonstrates strong performance over CONUS and, as such, the model would not need retraining. The reviewer is incorrect in their assertion that the approach would require re-training. Again, that work is still in progress and this manuscript demonstrates a key functionality and overcomes challenges with using an ML-surrogate model in a GHG flux inversion. The test dataset is indeed for the same region (SF Bay Area) as the training dataset. However, the test dataset is NOT the same time period as the training dataset. We train using data from 2018 and 2019. The test dataset is 2020.
- We will clarify this in the updated manuscript.
- Reviewer: The absence of temporal resolution in the FootNet footprints is a major limitation that will appear to many as a show-stopper. Who would accept the authors’ empirical fix after having visualized what it means in practice in Figure 1?
- As mentioned above, Section 4 and Figs 1-2 are direct tests of this choice of temporal resolution. This was tested using the full-physics model, not the emulator. So we can state with confidence that using this simplified representation of the temporal information performs better in a GHG flux inversion. The reviewer is grossly misinformed in their assessment that “this is a major limitation” and a “show-stopper”.
- We will update the text to clarify these experiments and the performance.
- Reviewer: The authors do not discuss mass conservation in FootNet, but this is the core of atmospheric inversion. Who would accept footprints that do not conserve tracer mass?
- This is a fair point. We are currently assessing formulations of the loss-function that include a penalty for not conserving mass. This is ongoing work that is being included for the 3rd paper mentioned above. Nevertheless, the evaluation of the emulator in the GHG flux inversion is a better metric for performance. This is why the GHG flux inversion is our gold standard for this work.
- We will update the text to explain why the GHG flux inversion is our gold standard for comparison here.
- Reviewer: The explanation of the better performance of FootNet in one experiment blames the physical model, but could well be due to a compensation between FootNet errors, in particular in time, and errors in the assignment of prior error correlations.
- The improved performance is not due to the treatment of the temporal structure as both STILT and FootNet use the approach detailed in Section 4 for this part of the analysis.
- This is already mentioned in the text, but we will clarify it in the updated manuscript.
- Reviewer: Sensitivity tests on some minor aspects of the initial non-peer-reviewed FootNet are more like filler.
- We disagree with the reviewer. These “minor aspects” were important updates that improved the performance of the emulator in our use case: a GHG flux inversion. This is why we included them in the manuscript.
- No change in response.
- Reviewer: After 11 pages, the list of FootNet input data in Table 1 may come as a shock. The diurnal cycle of meteorology is only sampled four times. Wind fields in input are used at one level only (10 meters). Boundary layer turbulence is represented by planetary boundary layer height only. Convective exchanges are not represented at all. Surprisingly, the authors preferred to introduce some Gaussian plume variables without clearly understanding the need to make them explicit to FootNet.
- The reviewer is correct that it is both surprising and interesting that we are able to reproduce these footprints with high fidelity using so few parameters. This is part of why the result is novel and should be published. The HRRR meteorology is saved at 6-hourly time intervals. Using STILT with HRRR data for a 24-hour back trajectory would similarly sample the meteorology 4 times. As such, the setup for the emulator is not that dissimilar from the full-physics model. In our extensive experience with these LPDMs, the most important parameter for the STILT model is the PBL height. This is directly used in the calculation of the footprint. Therefore it is unsurprising that PBL height was found to be the most important parameter related to convection in the emulator.
- No change in response.
- Reviewer: Joined with the absence of temporal resolution in output, we are closer to a toy model than to ensuring that climate goals are met (l. 309).
- As mentioned above, the reviewer has misunderstood the tests that were conducted for the temporal resolution. The reviewer’s characterization of the work as a “toy model” is grossly misinformed.
- We will clarify the text on the temporal experiments.
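As an illustration of the kind of loss-function penalty mentioned in the mass-conservation reply above, a minimal sketch follows. It is not the formulation under development for the third paper, which is not described in this discussion. The function name `mass_penalized_loss`, the `weight` parameter, and the use of the domain-summed footprint as a stand-in for tracer mass are all assumptions made for this example.

```python
def mass_penalized_loss(predicted, reference, weight=0.1):
    """Mean squared error plus a penalty on the mismatch in total footprint.

    predicted, reference: flat lists of footprint values on the same grid.
    weight: hypothetical tuning parameter for the mass term.
    """
    if len(predicted) != len(reference):
        raise ValueError("footprints must share the same grid")
    n = len(predicted)
    mse = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n
    # Assumption: the domain-summed sensitivity stands in for "tracer mass".
    mass_gap = sum(predicted) - sum(reference)
    return mse + weight * mass_gap ** 2


# Made-up footprints on a four-cell grid.
reference = [0.0, 0.2, 0.5, 0.3]
shifted = [0.2, 0.5, 0.3, 0.0]    # same total, wrong placement
inflated = [0.0, 0.3, 0.6, 0.4]   # right placement, too much total

loss_shifted = mass_penalized_loss(shifted, reference)
loss_inflated = mass_penalized_loss(inflated, reference)
```

In this toy setup the `shifted` case is penalized only through the pointwise error, while the `inflated` case is additionally penalized through the mass term, so the two kinds of error can be weighted separately.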
We will submit an updated manuscript with revisions to the text after receiving comments from the other reviewer.
Citation: https://doi.org/10.5194/egusphere-2024-2918-CC1
RC2: 'Reply on CC1', Anonymous Referee #1, 09 Oct 2024
The authors are upset and explain it harshly and quickly. I will still try to continue the scientific discussion based on the new information they provide as part of this specific submission to ACP.
In their submission, they referred 12 times to
He, T.-L., Dadheech, N., Thompson, T. M., and Turner, A. J.: FootNet: Development of a machine learning emulator of atmospheric transport, 2024b.
Without any other details, I initially found this report on the Internet
https://eartharxiv.org/repository/view/6392/
Now the authors explain that they referred to that one
https://egusphere.copernicus.org/preprints/2024/egusphere-2024-1526/
and that they are basically submitting a 3-part paper to at least two journals. The authors may understand that their presentation and their overall strategy are a source of confusion and may be challenged. Nothing inappropriate here in my previous text.
I am numbering my initial points to respond to their answers
- The initial paper basically implied that machine learning is the only alternative to some technical limitations of the Lagrangian framework. From a methodological point of view, the authors should not ignore the Eulerian option that has been used in the air quality domain for a long time and seems to emerge at km-scale for GHGs in some groups. Nothing inappropriate here in my previous text.
- This point is about the abstract, not about Section 4 and Figs 1-2. Nothing inappropriate here in my text.
- The cited text does not change the fact that, as far as I understand, the possibility to optimize wind direction was a major motivation behind the use of Gaussian models in these papers. Nothing inappropriate here in my text.
- The authors refer to a forthcoming analysis, which I do not have access to, so the discussion is complicated here as well. They also explain that the year of the test dataset is not the year of the training dataset, but where was this described? Nothing inappropriate here in my text.
- I maintain that Section 4 and Figs 1-2 are not convincing at all. The exponential decay is a very rough approximation that leaves the FootNet footprints so different from the reference ones that it tries to simulate. Nothing inappropriate here in my text.
- I am glad to see that at least this comment was found appropriate by the authors, but the fact that it will be answered in the third paper does not help.
- The fact that both STILT and FootNet use the same prior error covariance matrix would not prevent this matrix from being erroneous. Nothing inappropriate here in my text.
- My initial text remains appropriate: “Sensitivity tests on some minor aspects of the initial non-peer-reviewed FootNet are more like filler”.
- The authors seem to share my surprise about the limited set of predictors, which confirms that my text was appropriate. The results are interesting in this respect, but we expect more analysis (understanding) in a peer reviewed paper, in particular because we are touching the question of the capability of the method in other regions or periods of the year. It may be unsurprising that PBL height was found to be the most important parameter related to convection in the emulator, but this does not rule out a significant role of other parameters, for instance linked to temperature. Similarly, the choice of the sole 10-m level for wind needs to be substantiated: does it statistically contain wind information on upper levels, in these two cases? in all cases?
- The authors misquoted my text and we obviously disagree on what ensuring that climate goals are met implies in practice (e.g., in terms of accuracy). Nothing inappropriate here in my text.
Citation: https://doi.org/10.5194/egusphere-2024-2918-RC2
CC2: 'Reply on RC2', Alexander Turner, 09 Oct 2024
Reviewer: I maintain that Section 4 and Figs 1-2 are not convincing at all. The exponential decay is a very rough approximation that leaves the FootNet footprints so different from the reference ones that it tries to simulate.
There seems to be lingering confusion about Section 4 and the temporal information. As mentioned in our reply, we will clarify this in the updated manuscript. Section 4 is only using the full-physics LPDM: the STILT model. All results shown in Section 4 and Figs 1-2 are from STILT. We conducted two inversions:
- Inversion A: using STILT with time-resolved footprints
- Inversion B: using STILT with temporal information represented using the exponential decay
Figure 2 shows the results of these two inversions. The scatterplots show a cross validation of the simulated CO2 concentrations using the prior and posterior fluxes (i.e., the observations in panels D and H were withheld from the flux inversion). We set the seed such that the same observations are sampled in both cases. We note that Inversion B outperforms Inversion A by all measures. That is to say, using the exponential decay approach does a better job of representing observed CO2 concentrations in the atmosphere (in this case they are from the BEACO2N network). From this, we conclude that not only is the exponential decay a reasonable method for approximating the temporal patterns in the footprint, it out-performs the time resolved footprints when used in a flux inversion. Again, this conclusion was drawn using STILT, not the emulator. This was a surprise to us and has held up to other tests we conducted (not shown here). This finding is important for others conducting GHG flux inversions with LPDMs.
The reasoning for this improved performance is that while the time resolved footprints are more realistic, they are not necessarily more accurate representations of what actually happened in the atmosphere. Small errors in meteorology could lead you to misattribute those fluxes to the wrong locations. Using a simpler representation of this temporal information may be important for a flux inversion system. Again, this is an important point for others conducting GHG flux inversions.
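The exact decay formulation is given in Section 4 of the manuscript and is not reproduced in this discussion. The sketch below shows one plausible reading, under stated assumptions: a time-integrated footprint is distributed over the hours before the observation using normalized weights proportional to exp(-k/tau). The function names, the e-folding time `tau_hours`, the six-hour window, and the footprint values are hypothetical.

```python
import math


def decay_weights(n_hours, tau_hours):
    """Normalized weights proportional to exp(-k / tau) for k = 0..n_hours-1."""
    raw = [math.exp(-k / tau_hours) for k in range(n_hours)]
    total = sum(raw)
    return [r / total for r in raw]


def spread_in_time(integrated_footprint, n_hours, tau_hours):
    """Distribute a time-integrated footprint over the hours before an observation.

    Returns one spatial field (list of cell values) per back-hour.
    """
    weights = decay_weights(n_hours, tau_hours)
    return [[w * cell for cell in integrated_footprint] for w in weights]


# Made-up time-integrated footprint on a four-cell grid.
integrated = [0.0, 1.2, 0.6, 0.2]
hourly = spread_in_time(integrated, n_hours=6, tau_hours=2.0)
```

Because the weights sum to one, summing the hourly fields recovers the time-integrated footprint. Only the temporal allocation changes, while the spatial pattern is the same at every back-hour.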
Returning to FootNet, we conclude that this exponential decay approach is a viable approach for FootNet. The reviewer's characterization of our approach having "no temporal resolution" is incorrect. After Figure 2, all of the flux inversions shown in the manuscript use the exponential decay approach (including the top row of Figure 8). This means that any differences in performance between STILT and FootNet are not due to the representation of temporal information.
Finally, this inversion setup mirrors that of Turner et al. (2020). So this case study is based on a realistic inversion from the published literature.
Again, we will clarify this in the updated manuscript. Apologies for any confusion in the initial description. All other comments were previously addressed in our reply.
Citation: https://doi.org/10.5194/egusphere-2024-2918-CC2
RC3: 'Review comment on egusphere-2024-2918', Anonymous Referee #2, 15 Oct 2024
General comments
The authors further revise and improve the FootNet model, which is used to efficiently calculate surface flux footprints for the Bay Area urban CO2 network. The FootNet footprints are used here in an inversion, and the resulting fluxes and fit to observed CO2 are compared with those from an inverse model based on footprints calculated in the traditional way with the STILT LPDM. It is shown that the emulated footprints perform on par with the LPDM-simulated ones when applied in an inversion and may even have an advantage on some metrics. The work is a valuable contribution extending a still limited line of studies that use neural networks trained on LPDM outputs to simulate flux footprints at much reduced CPU time. The technology is important for advancing emission studies with high-resolution (1-10 km) wide-swath satellite observations of GHGs, which involve processing large volumes of data and thus require enormous amounts of CPU time and storage. The work relies on high-resolution HRRR meteorology available over CONUS and well-established STILT transport, so it can be readily extended to many other regional applications. The paper is well written and can be accepted with minor revision, clarifying several points suggested in the detailed comments.
Detailed comments
Lines 9-11 (abstract) The authors mention that, surprisingly, “the updated FootNet model out-performs the full-physics model when used in a flux inversion”. However, they do not have a rational explanation for this and speculate: “This improved performance is likely because atmospheric transport simulated with a full-physics transport model is not necessarily more accurate. The more simplistic representation of transport in the machine learning model helps to mitigate transport errors”. I suggest dropping this speculative discussion, as it may happen that in the next version or another case study the ML and full-physics footprints will have other biases and the advantage of the ML model will be lost.
Lines 21-22 By giving this list of references, the authors implicitly limit the type of models useful for inverse modeling at high resolution to Lagrangian ones, while there are successful examples of using Eulerian models for the same purpose (e.g., Steiner et al., 2024). On the other hand, the possible applications of ML-based footprint simulators do not have to be regional, as there are global models using Lagrangian footprints (e.g., Nayagam et al., 2024; Janardanan et al., 2024) that face the same or bigger computational challenges as the regional ones.
Lines 28-30 The authors could add satellite-based studies of point sources (e.g., Janardanan et al., 2016).
Lines 57-58 Note that for large n, one may opt to use forward transport instead, either Lagrangian or Eulerian or plume-based like PMIF (Wang et al., 2020).
Line 224, Table 1. Judging from the experience of applying limited sets of parameters to describe PBL mixing before 3-D dynamic models of turbulence (e.g., Hanna, 1984), the choice of driving variables does not look optimal. Why not include surface stress, Monin-Obukhov length, or 100 m or mid-PBL winds, for example?
Lines 315-320 There is an impression that the ad hoc replacement of time-resolved footprints with a decay-based model will not be universally applicable, and this should be mentioned as a limitation of the proposed method.
References
Hanna SR, Applications in air pollution modeling. In: Nieuwstadt FTM, van Dop H (Eds) Atmospheric turbulence and air pollution modelling. Atmos Sci Library, vol 1. Springer, Dordrecht. 10.1007/978-94-010-9112-1_7, 1984
Janardanan et al., Comparing GOSAT observations of localized CO2 enhancements by large emitters with inventory-based estimates, GRL, 43, 3486-3493, 10.1002/2016GL067843, 2016
Janardanan et al., Country-level methane emissions and their sectoral trends during 2009–2020 estimated by high-resolution inversion of GOSAT and surface observations, Environ. Res. Lett., 19, 10.1088/1748-9326/ad2436, 2024
Nayagam et al., A top-down estimation of subnational CO2 budget using a global high-resolution inverse model with data from regional surface networks, Environ. Res. Lett., 19, 10.1088/1748-9326/ad0f74, 2024.
Steiner et al., European CH4 inversions with ICON-ART coupled to the CarbonTracker Data Assimilation Shell, Atmos. Chem. Phys., 24, 2759–2782, 10.5194/acp-24-2759-2024, 2024.
Wang et al.: PMIF v1.0: assessing the potential of satellite observations to constrain CO2 emissions from large cities and point sources over the globe using synthetic data, Geosci. Model Dev., 13, 5813–5831, 10.5194/gmd-13-5813-2020, 2020.
Citation: https://doi.org/10.5194/egusphere-2024-2918-RC3
RC4: 'Comment on egusphere-2024-2918', Anonymous Referee #3, 18 Nov 2024
General Comment:
The manuscript “High-Resolution GHG Flux Inversions Using a Machine Learning Surrogate Model for Atmospheric Transport” presents an important study showcasing the potential of ML-based emulators for atmospheric transport in enabling fast computation of observation footprints used in flux inversions. The demonstrated gains in computational speed and storage compared to traditional methods relying on full-physics models are convincing and could pave the way for near-real-time GHG flux monitoring using dense observational systems, such as the new generation of satellite instruments. Provided the comments below are addressed, this paper would be suitable for publication in this journal.
One key claim of this work is that the proposed ML approach (FootNet v2) outperforms the full-physics model in the flux inversion. The authors attribute this to the smoother spatial structure of the FootNet v2 footprints, which they hypothesize helps mitigate transport error. Given the broad implications of this finding for the field, this important statement should be supported with more evidence than the statistical results obtained from a single case study. Two potential avenues for further substantiation are:
1. Extending the comparison to other cases reflecting different meteorological conditions.
2. Conducting an OSSE (Observing System Simulation Experiment).
I suggest the authors conduct at least an OSSE experiment, which would entail:
a) Generating synthetic observations from a reference “true” emission field.
b) Generating a prior ensemble of HRRR meteorology (or from another model) and fluxes.
c) Performing multiple inversions using both STILT and FootNet v2.
Each inversion would use a different meteorological realization to simulate transport error. The hypothesis presented in this study suggests that the ML surrogate, by smoothing the transport patterns, would produce better inversion results on average than STILT. This could be evaluated by comparing the standard deviations of inversion errors (relative to the known true fluxes) and the biases.
Although this approach is idealized, it would provide more robust statistical evidence to support the claim that the ML surrogate yields better performance than the full-physics model.
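The OSSE steps (a) to (c) above can be sketched in a heavily simplified form. The example below is a toy under stated assumptions, not the experiment the referee proposes: the state is a single flux scaling factor, the "footprint" is a short list of sensitivities, and transport error is mimicked by multiplicative Gaussian noise rather than by an ensemble of HRRR meteorology. The prior is held fixed instead of sampled, and all function names and numerical values are hypothetical.

```python
import random
import statistics


def invert_scale(obs, footprint, prior_scale, prior_sd, obs_sd):
    """Posterior mean of one flux scaling factor (Gaussian prior and errors)."""
    precision = 1.0 / prior_sd**2 + sum(h * h for h in footprint) / obs_sd**2
    innovation = sum(
        h * (y - h * prior_scale) for h, y in zip(footprint, obs)
    ) / obs_sd**2
    return prior_scale + innovation / precision


def run_osse(n_members=200, true_scale=1.3, transport_sd=0.2, seed=0):
    """Return (bias, standard deviation) of the inversion error over the ensemble."""
    rng = random.Random(seed)
    true_footprint = [0.5 + 0.1 * i for i in range(10)]
    obs_sd = 0.05
    # (a) Synthetic observations from the known "true" flux and transport.
    obs = [h * true_scale + rng.gauss(0.0, obs_sd) for h in true_footprint]
    errors = []
    for _ in range(n_members):
        # (b) One perturbed transport realization per ensemble member.
        member = [h * (1.0 + rng.gauss(0.0, transport_sd)) for h in true_footprint]
        # (c) Invert with the perturbed transport and record the error.
        posterior = invert_scale(obs, member, prior_scale=1.0, prior_sd=0.5,
                                 obs_sd=obs_sd)
        errors.append(posterior - true_scale)
    return statistics.mean(errors), statistics.stdev(errors)


bias, spread = run_osse()
```

To compare two transport operators in the spirit of the referee's suggestion, `run_osse` would be run once per operator with the same seed, and the resulting biases and spreads compared.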
Minor Comments:
Introduction:
The authors should discuss the use of variational methods as an alternative approach to address high-dimensional inversion problems involving large flux and/or observation spaces. In this framework, transport Jacobians do not need to be explicitly constructed, and efficient minimization algorithms enable rapid computation of mean posterior fluxes.
Conclusion:
It would be useful to discuss the potential application of ML surrogates for atmospheric transport in performing MCMC inversions. This approach could provide full posterior probability density functions (PDFs) without constraining the prior PDFs to specific forms, such as Gaussian priors.
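A minimal random-walk Metropolis sketch illustrates the MCMC idea raised here. It is an assumption-laden toy rather than a proposal for the manuscript: the state is a single flux scaling factor, the prior is log-normal (so non-Gaussian and strictly positive), and the footprint is held fixed. In a realistic setting each likelihood evaluation would call the transport model or its surrogate. All names and numerical values are hypothetical.

```python
import math
import random


def log_posterior(scale, obs, footprint, obs_sd):
    """Log-normal prior on the scaling factor plus Gaussian observation errors."""
    if scale <= 0.0:
        return -math.inf  # the prior excludes non-positive fluxes
    # Hypothetical log-normal prior: median 1, log-standard-deviation 0.5.
    log_prior = -math.log(scale) - 0.5 * (math.log(scale) / 0.5) ** 2
    misfit = sum((y - h * scale) ** 2 for h, y in zip(footprint, obs))
    return log_prior - 0.5 * misfit / obs_sd**2


def metropolis(obs, footprint, obs_sd, n_steps=5000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for the scaling factor."""
    rng = random.Random(seed)
    current = 1.0
    current_lp = log_posterior(current, obs, footprint, obs_sd)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, obs, footprint, obs_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


footprint = [0.5, 1.0, 1.5, 2.0]          # made-up sensitivities
obs = [h * 1.3 for h in footprint]        # noise-free synthetic observations
samples = metropolis(obs, footprint, obs_sd=0.1)
posterior_mean = sum(samples[2500:]) / len(samples[2500:])
```

The retained samples approximate the full posterior distribution, so quantiles and other summaries can be read off directly without assuming a Gaussian form.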
Citation: https://doi.org/10.5194/egusphere-2024-2918-RC4
Data sets
High-resolution greenhouse gas flux inversions using a machine learning surrogate model Nikhil Dadheech https://doi.org/10.5281/zenodo.13750963
Model code and software
High-resolution greenhouse gas flux inversions using a machine learning surrogate model Nikhil Dadheech https://doi.org/10.5281/zenodo.13750963