This work is distributed under the Creative Commons Attribution 4.0 License.
Quantification of regional terrestrial biosphere CO2 flux errors in v10 OCO-2 MIP models using airborne measurements
Abstract. Multi-inverse modeling inter-comparison projects (MIPs) provide a chance to assess the uncertainties in inversion estimates arising from various sources such as atmospheric CO2 observations, transport models, and prior fluxes. However, accurately quantifying ensemble CO2 flux errors remains challenging, often relying on the ensemble spread as a surrogate. This study proposes a method to quantify the errors of regional terrestrial biosphere CO2 flux estimates from 10 inverse models within the Orbiting Carbon Observatory-2 (OCO-2) MIP by using independent airborne CO2 measurements for the period 2015–2017. We first calculate the root-mean-square error (RMSE) between the ensemble mean of posterior CO2 concentration estimates and airborne observations and then isolate the CO2 concentration error caused solely by the ensemble mean of posterior terrestrial biosphere CO2 flux estimates by subtracting the errors of observation and transport in seven regions. Our analysis reveals significant regional variations in the average monthly RMSE over three years, ranging from 0.90 to 2.04 ppm. The ensemble flux error projected into CO2 space is a major component that accounts for 58–84 % of the mean RMSE. We further show that in five regions, the observation-based error estimates exceed the atmospheric CO2 errors computed from the ensemble spread of posterior CO2 flux estimates by 1.37–1.89 times, implying an underestimation of the actual ensemble flux error, while their magnitudes are comparable in two regions. By identifying the most sensitive areas to airborne measurements through adjoint sensitivity analysis, we find that the underestimation of flux errors is prominent in eastern parts of Australia and East Asia, western parts of Europe and Southeast Asia, and midlatitude North America, suggesting the presence of systematic biases related to anthropogenic CO2 emissions in inversion estimates. 
The regions with no underestimation were southeastern Alaska and northeastern South America. Our study emphasizes the value of independent airborne measurements not only for the overall evaluation of inversion performance but also for quantifying regional errors in ensemble terrestrial biosphere flux estimates.
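The error decomposition described in the abstract can be sketched in a few lines, under the simplifying assumption that observation, transport, and flux-driven errors are independent and therefore add in quadrature. The function name and the numbers below are illustrative only, not values from the paper:

```python
import numpy as np

def flux_error_from_rmse(total_rmse, obs_error, transport_error):
    """Isolate the flux-driven CO2 error (ppm) from the total
    model-observation RMSE (ppm), assuming the observation, transport,
    and flux error components are independent and add in quadrature."""
    flux_var = total_rmse**2 - obs_error**2 - transport_error**2
    # Clamp at zero: sampling noise can make the residual variance negative.
    return np.sqrt(np.maximum(flux_var, 0.0))

# Illustrative inputs (not from the paper): total RMSE 1.5 ppm,
# measurement error 0.3 ppm, transport error 0.6 ppm.
print(flux_error_from_rmse(1.5, 0.3, 0.6))
```

Comparing this observation-based flux error with the error implied by the ensemble spread of posterior fluxes, as the study does region by region, then gives the underestimation ratios quoted in the abstract.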
Status: final response (author comments only)
CC1: 'Comment on egusphere-2023-2258', Andrew Jacobson, 15 Nov 2023
1. The OCO-2 v10 MIP sampled a much wider set of aircraft data than
those used in this study. In particular NOAA operates a light aircraft
program that produces regular profiles of CO2 measurements over North
America and Raratonga. These data should be well suited to the
analysis conducted here due to the regular sampling frequency, nearly
continuous coverage, and altitudes sampled. For some reason, of these
timeseries stations, only the data from Dahlen, North Dakota (DND) and
Marcellus, Pennsylvania (MRC) were included in Table 1 of the
manuscript. In addition to these two sites, there are evaluation data
in the OCO-2 MIP samples from timeseries over:
Briggsdale, Colorado - (CAR)
Offshore Cape May, New Jersey - (CMA)
Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) - (CRV)
Estevan Point, British Columbia - (ESP)
East Trout Lake, Saskatchewan - (ETL)
Homer, Illinois - (HIL)
INFLUX (Indianapolis Flux Experiment) - (INX)
Park Falls, Wisconsin - (LEF)
Offshore Portsmouth, New Hampshire (Isles of Shoals) - (NHA)
Poker Flat, Alaska - (PFA)
Rarotonga - (RTA)
Offshore Charleston, South Carolina - (SCA)
Southern Great Plains, Oklahoma - (SGP)
Offshore Corpus Christi, Texas - (TGC)
Trinidad Head, California - (THD)
West Branch, Iowa - (WBI)

2. This reviewer's experience with simulation of aircraft measurements
is that model residuals are strongly affected by altitude and by
season. The analysis here does not discriminate by either of these
factors, except to choose an altitude range apparently chosen to
minimize the effect of residuals closer to the surface. Should the
model residuals have significant variability by these factors, the
evaluation criteria would be affected and possibly dominated by those
factors, which would confound the statistical conclusions of this
work. I suggest that a factor analysis, possibly an analysis of
variance, is needed to determine whether model residuals are driven by
these factors.
3. Lines 124-125: "measurements made between 1 and 5 km altitude" does
not specify whether this means above ground level or above sea
level. This needs to be specified. Furthermore, if this altitude range
is above sea level then it is entirely possible that highly-variable
PBL measurement data are included in the evaluation data, since many
aircraft data were collected over topography with surface elevations
of hundreds of meters ASL. This would cloud the analysis with noisy
measurements having strong signals of local exchange.4. It is not clear whether the analysis excludes measurements that
were assimilated in the LNLGIS experiment. This is a fundamental piece
of information needed to understand the analysis and should absolutely
be explicitly stated. If assimilation data are included, then the
entire analysis needs to be considered differently.

5. The INPE PFP data used in this study have not been screened for water
vapor contamination. This is a known problem with PFPs in humid
environments and can lead to both a low bias and spurious variability
in CO2 measurements. This is a particular concern with tropical
aircraft samples due to expected high humidity of sampled air. There
are indications that water vapor contamination can persist in PFP
flasks so that even dry high-altitude samples may be affected. This
water vapor issue in aircraft PFPs has been documented in Baier et
al. (2019,
https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2019JD031339) and
reported at various meetings
(e.g. https://gml.noaa.gov/publications/annual_meetings/2019/abstracts/74-190401-B.pdf).

As reported to the authors in OCO-2 meetings, about one-third of
historical NOAA PFP measurements have been flagged due to suspected
water vapor contamination. In the same meetings the authors were
cautioned about this issue affecting INPE PFP data. In ObsPack
products, INPE PFP data are all flagged as "do not assimilate",
indicating that they are neither suitable for assimilation nor for
evaluation purposes. Finally, these data are distributed in a special
ObsPack product labeled "restricted" in part to warn users about the
problem.

6. The CO2 measurement data used in this study have not been correctly
cited. It also is not clear whether ObsPack data providers have been
properly acknowledged. The OCO-2 ObsPack product is a "composite"
product created from seven source ObsPacks. The source products need
to be cited following the instructions at
https://gml.noaa.gov/ccgg/obspack/citation.php (available also in the
distributed metadata). Use of an ObsPack product also includes usage
terms which suggest that it may be appropriate to offer coauthorship
to the data providers. The seven source ObsPacks are listed in the
metadata directory of the downloaded product. In the current draft,
only the obspack_co2_1_GLOBALVIEWplus_v6.1_2021-03-01 product is
cited, whereas apparently there are data used from five other
ObsPacks: the NRT product, the Manaus product, the INPE product, the
CONTRAIL product, and the AirCore product.Citation: https://doi.org/10.5194/egusphere-2023-2258-CC1 - AC1: 'Reply on CC1', Jeongmin Yun, 17 Dec 2023
RC1: 'Comment on egusphere-2023-2258', Anonymous Referee #1, 16 Jan 2024
Yun et al have written an elegant and thought-provoking paper on error quantification for the OCO-2 MIP v10 dataset. The approach is a mixture of theory, where possible, and approximations, otherwise. It is certainly an interesting contribution, but its crude assumptions limit its scope to a scenario (“if we could assume that…, then we could conclude that…”). In fact, in practice, the main result seems to be the inference of... one element of the MIP protocol, namely the fact that fossil fuel fluxes were imposed on all inverse modeling systems. I can only encourage the authors to thoroughly revise their text in order to give it the right perspective.
Detailed comment:
- Title: the flux errors here refer to the average of the flux ensemble, not to the individual flux sets. Please correct
- 27: this main conclusion is not a finding but an input to the study (l. 118).
- 53: RECCAP seems to be driven by GCP (https://www.globalcarbonproject.org/reccap/overview.htm) – what is the difference with the previous item (l. 52)?
- 66: please insert “explicitly” before “incorporate”, as the difference between systematic errors and error correlations can be subtle
- 77: it would be fairer to write that this method has no theoretical basis. “lacks” may suggest that there is hope to find one (or, please, elaborate).
- 93: the given definition of “error” is surprising, because “observed” is vague (by which technique?), and because the previous sentence is about fluxes.
- Figure 1, point 1): is actually about flux+transport errors
- 116: ten members only, covering only four transport models. How can their statistics be robust? Briefly touched in l. 483-7, but too late.
- 148: please insert “below” before “to evaluate”
- 160: the way this exclusion is done biases the statistics towards the model values. Awkward.
- 180-1: strong approximation. Also, the fact that the simulations were made at half-degree resolution, which is so much coarser than the measurements
- 197-198: This method mainly relies on this approximation, but I see no justification. You need to convince the reader that it is reasonable. Based, e.g., on Schuh et al (2019), “The research suggests that variability among transport models remains the largest source of uncertainty across global flux inversion systems”, cited in l. 49, I would be surprised if it was, but please explain why I am wrong!
- 215: the concept of forward simulations obtained with a (backward-running) adjoint model is not intuitive. Did you use the adjoint to compute the Jacobian matrix and then did you run it forward?
- 220: if I understand it well, sub-monthly patterns are fixed, even though the comparison is to instantaneous measurements. The spread should be largely underestimated.
- 226: again, I do not trust this hypothesis
- Eq. (13): my previous comments challenge it
- 289: RMSE has already been defined
- 300: Liu et al. (2022) is missing. I am looking for it to read the basis of “indicating most inverse models have common significant errors for this region”.
- 493: based on the above, I would challenge this statement.
Citation: https://doi.org/10.5194/egusphere-2023-2258-RC1
RC2: 'Comment on egusphere-2023-2258', Anonymous Referee #2, 25 Feb 2024
Review.
This is an interesting and creative manuscript that I believe makes useful progress toward a challenging and important objective - evaluating uncertainty in the results of inverse estimates of biogenic CO2 fluxes. I believe the results and conclusions are justified given what I can gather from the data presented. My primary concern is the clarity of the text, both the methods and the results. At worst I could not understand some of the methods and results, and in other areas I think that I understand, but the presentation makes understanding a struggle.
I would encourage the authors to consider revising some of the presentation to make this important work more accessible. I have two main concerns.
1. The authors work hard to explain the methods, but I struggled to follow. The figure is a good idea, and the appendix is very helpful. I found, however, that the terminology used, including the mathematical symbols used to define terms, was revealed gradually and irregularly. This makes reading the document difficult. I would strongly recommend presenting the most important variables and their definition up front and early in the text, and making sure to stick to that terminology and variable set throughout. I would make it easy for the reader to quickly look up the meaning of the most important variables used in the main results.
2. Some of the presentation of results needs, in my opinion, to be rewritten. Some of the results are not organized into clearly written paragraphs, with a key finding as the topic sentence and discussion in the paragraph that explains the reasoning behind that key finding. Instead there are paragraphs that tend toward describing the figures, raising conclusions mid-paragraph or at the end of the paragraph, and those conclusions are clearly linked (in my mind) to the preceding text. I believe that rewriting some of the results and discussion (see detailed notes) will make the document easier to follow and more clearly illustrate what appear to be an interesting set of results derived from a creative set of methods.
3. I have one question about the content. The number of airborne observations (which is not well defined, see my notes below) vary dramatically from region to region. I would expect this to have a much larger impact on the results than it appears to have. Heavily sampled regions (e.g. N America) don’t appear much better understood than severely undersampled regions (e.g. S. America). Should we infer that intensive aircraft campaigns are not very beneficial, and that very limited sampling provides sufficient information for evaluating uncertainties in inversions? Or that large investments in sampling does not greatly improve our understanding? Or is it safer to say that we have not yet learned how to use extensive data set to our greatest benefit?
In sum I find the document very much worthy of publication, but in need of work on the presentation.
Detailed comments:
1. Lines 23 and 25. Are these references to fluxes specific to biogenic CO2 fluxes? At a few places in the abstract it isn’t clear what fluxes are included. This gets especially confusing on line 27 when anthropogenic CO2 emissions are specified.
2. Line 36-37. English needs some work.
3. Line 40. final phrase is left dangling.
4. Line 48. I’m not sure what “systematic errors in …. inversion setups” means.
5. Lines 85-90. This is tough to follow. But let me try the methods, then perhaps this will be clearer.
6. Lines 91-92. If the objective focuses on the use of airborne observations, it might help to include some description of these observations and their suitability for this task in the introduction.
7. Line 96. I’m puzzled by the statement, “an approximation of RMSE.” Maybe, “RMSE in the elements of the ensemble”? I’m not sure that is clearer.
8. Line 99. Next? Did you just present this as (2) in line 96?
9. Line 101. What are the true errors? Does this differ from the ratio exercise described earlier?
10. Figure 1. This is a nice idea, but the terms in this figure need to be defined. At present these terms don’t match the terms in the text, and there are many undefined terms in the figure.
11. Line 126. I would recommend adding citations that document these field campaigns.
12. Figure 2. The caption refers to the number of airborne measurements. What constitutes one airborne measurement? Many aircraft campaigns have continuous observations and gigabytes of data. Please explain the quantization of the data that is used in this figure. If this number of the number of 1x1 degree grids with an observation, what is the temporal unit for an observation? If the same location is measured for 100 hours over 10 days within one month, is that one measurement or ten or 100 measurements?
13. Table 1. Please include citations for data sets when possible. I am sure, for example, that there is a data citation available for AToM observations.
14. Line 149. “simulated atmospheric CO2 mole fractions”? And please explain, “the observed one”. What is “the observed one”?
15. Line 153. “the 1x1 grid cell”
16. Line 153. What constitutes one airborne measurement? The continuous aircraft campaigns have MANY more measurements than is suggested by Figure 2. Please explain your definition of one measurement.
17. Line 159-160. I don’t believe that the ensemble mean accounts for transport errors. The ensemble includes them, at least as represented by the ensemble.
18. Line 161. I don’t object to removing these outliers, but I’m not sure this ensures robust error estimates.
19. Line 175-180. I may just be tired, but I am having a very hard time following this discussion. This is an interesting approach to evaluating uncertainty. It would be great if this could be explained more clearly. Figure 1 is an interesting complement to this text, but it isn’t cited at all in this text. Perhaps you could clarify your methods by connecting the terms in Figure 1 explicitly to this text and to Appendix A.
20. Line 202. Please explain, “the regional average of error matrices.”
21. Figure 2a. Some regions have very, very few observations. What does that do to your results?
22. Lines 312-313. Are the RMSE values between 1 and 3 ppm? Or is 1-3ppm the range of the values of RMSE?
23. Figure 4. caption. I think these are monthly values of RMSE. Monthly variations of RMSE sounds to me like the variance of the RMSE.
24. Paragraph starting on line 336. What is the main point of this paragraph? I have the same concern for all the paragraphs up to line 384. These paragraphs tend to describe the contents of the figures. It is hard for me to extract the main result. I suggest starting each of these paragraphs with a topic sentence that presents your main finding, then use the paragraph to explain this finding.
25. Line 386. I cannot find in section 2 where to find the method for determining the most influential areas for observed atmospheric CO2. And again, this is not a result.
26. Line 393-394. I do not understand what is meant by the sentence starting with “Figure 6a…” and I don’t understand the associated figure. Further, the text following this statement describes methodology, not results. Can you please explain Figure 6 methodology in the methods section of the text?
27. Lines 419-421. I don’t understand how this follows from the preceding text. If this is the main finding, please begin the paragraph with this statement, then use the paragraph to explain this statement. At present, I cannot follow this argument. It is an interesting argument. Please explain it more clearly.
28. Lines 424-428. This is material for the introduction, not the discussion.
29. Line 435-437. This sentence needs work.
30. Line 437. This result could be a natural consequence of what?
31. Line 444. I don’t follow the “This underestimation…” sentence. Please clarify.
32. Line 444. What do you mean by “the common assumptions and observations…”? Are you arguing that since many ensemble members share common data and common methodological assumptions, this results in the spread among them being an underestimate of the true uncertainty in fluxes? This is possible and an interesting assertion, but I don’t think it is proven by this work.
33. Line 449. What is a “main source region”?
34. Lines 458-459. I don’t understand the origins of the 15% figure, or the meaning of “challenges” in estimating monthly flux errors. I very much agree with the concern at the end of this paragraph that areas with limited data may not have sufficient data for computing reliable error statistics for the flux inversions. I think these are related topics. Please clarify.
35. Line 477. “Second…” This is another paragraph.
36. Line 491-492. I am not convinced that these flux errors are largest in errors with large anthropogenic fluxes. This is a plausible hypothesis, but I would not say that the results reveal this to be true. I would like to see a more careful analysis of the fossil fluxes in the relevant influence regions, and the relationship between the strength of fossil fluxes and these flux errors to be convinced.
Citation: https://doi.org/10.5194/egusphere-2023-2258-RC2