NMVOC emission optimization in China through assimilating formaldehyde retrievals from multiple satellite products

Xu, Canjie; Jin, Jianbing; Li, Ke; Qi, Yinfei; Xia, Ji; Lin, Hai Xiang; Liao, Hong

doi:10.5194/egusphere-2025-140

Preprints

https://doi.org/10.5194/egusphere-2025-140

07 May 2025

Abstract. Non-methane volatile organic compounds (NMVOCs) are key precursors of ozone and secondary organic aerosols. Because China is a major source of NMVOCs, an accurate emission inventory is crucial for understanding and controlling atmospheric pollution. Mainstream inventories are constructed using bottom-up approaches, which cannot accurately reflect the spatiotemporal characteristics of NMVOC emissions and therefore degrade model simulations. This study performs monthly optimization of NMVOC emissions in China by assimilating formaldehyde retrievals from the latest satellite products. A semi-variogram spatial analysis conducted before assimilation highlights the advantages of the Tropospheric Monitoring Instrument (TROPOMI) and Ozone Mapping and Profiler Suite (OMPS) formaldehyde products over Ozone Monitoring Instrument (OMI) retrievals for estimating NMVOC emissions at high resolution. The emission optimization is based on a self-developed 4DEnVar system. Assimilating OMPS formaldehyde yields a positive increment in NMVOC emissions: annual anthropogenic emissions rise from 22.40 to 41.32 Tg, biogenic emissions from 16.56 to 28.01 Tg, and biomass burning emissions from 0.29 to 0.65 Tg. Model simulations driven by the posterior inventories outperform those driven by the prior, as validated against independent satellite measurements and surface ozone measurements. Nationwide, the RMSE of the simulated formaldehyde columns decreases from 0.49 to 0.45 ×10¹⁶ molec/cm². In the severely polluted North China Plain (NCP), the RMSE drops from 0.52 to 0.37 ×10¹⁶ molec/cm², reaching levels comparable to TROPOMI. Validation using surface ozone observations also yields favorable results, especially in the NCP.

Received: 13 Jan 2025 – Discussion started: 07 May 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 15272 KB)

Supplement (4468 KB)

Status: final response (author comments only)

RC1:
'Comment on egusphere-2025-140', Anonymous Referee #2, 29 May 2025
This manuscript presents an inverse modeling study aiming to optimize NMVOC emissions over China by assimilating satellite-based formaldehyde observations from OMPS and TROPOMI. While the topic is relevant and within the scope of Atmospheric Chemistry and Physics (ACP), the manuscript currently lacks sufficient methodological rigor, clarity in presentation, and justification of key assumptions. There is significant room for improvement before the work can be considered for publication. My detailed comments are provided below.
General Comments and Major Concerns

Clarity and Consistency in Satellite Usage (Major)

The manuscript lacks consistency in describing which satellite datasets are assimilated and which are used for validation. The abstract suggests that only OMPS is used for assimilation and TROPOMI is used as an independent validation dataset. However, the methods section refers to assimilation experiments involving OMPS, TROPOMI, and their combination. Furthermore, Eq. (3) implies the use of a single observational constraint. If the combination refers to an average of OMPS and TROPOMI data, this should be clearly stated and methodologically justified. Averaging observations reduces variance and effectively increases their weight in the cost function—this is not equivalent to joint multi-satellite assimilation. This distinction must be clarified and its implications explicitly discussed.
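
The distinction between averaging and joint assimilation can be illustrated with a scalar toy problem. The sketch below is not taken from the manuscript: the prior value, the two retrievals, and their error variances are made-up numbers, the observation operator is assumed to be the identity, and all errors are assumed Gaussian and independent. Under those assumptions it compares (a) assimilating both retrievals with their own errors and (b) assimilating their simple average with the propagated variance (σ1² + σ2²)/4.

```python
def posterior(xb, var_b, obs, var_obs):
    """Scalar Bayesian update with independent Gaussian observations (H = 1).

    Returns the posterior mean and posterior variance.
    """
    precision = 1.0 / var_b + sum(1.0 / v for v in var_obs)
    mean = (xb / var_b + sum(y / v for y, v in zip(obs, var_obs))) / precision
    return mean, 1.0 / precision


# Hypothetical numbers, chosen only for illustration.
xb, var_b = 1.0, 1.0      # prior column and its error variance
y1, var1 = 1.6, 0.04      # low-noise retrieval (sigma = 0.2)
y2, var2 = 2.0, 0.64      # noisy retrieval (sigma = 0.8)

# (a) Joint assimilation: each retrieval keeps its own error variance.
joint = posterior(xb, var_b, [y1, y2], [var1, var2])

# (b) Simple average, with the variance of the mean of two independent errors.
y_avg = 0.5 * (y1 + y2)
averaged = posterior(xb, var_b, [y_avg], [(var1 + var2) / 4.0])

print("joint    mean, variance:", joint)
print("averaged mean, variance:", averaged)
```

In this toy setting the two analyses coincide only when both retrievals carry the same error variance. Otherwise the simple average gives the noisier retrieval the same weight as the cleaner one, so the two posteriors differ.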

Lack of Bias Correction for Satellite Data (Major)

The study does not apply bias correction across satellite datasets, which is a critical omission. HCHO retrievals from OMPS and TROPOMI differ due to varying retrieval algorithms, cloud screening, and a priori assumptions. These systematic differences must be addressed before assimilation. Previous studies (e.g., Zhu et al., 2020; Müller et al., 2024) have shown the importance of bias correction using independent datasets such as aircraft or FTIR observations. At minimum, the authors should:

- Justify the omission of bias correction
- Discuss associated uncertainties
- Provide quantitative comparisons between satellite datasets prior to assimilation (with figures in the main text)
- Display and discuss the observation uncertainties used in the assimilation

Unrealistic Assumptions for Emission Uncertainty (Major)

The manuscript assumes a uniform 100% random uncertainty for all emission sectors and species. This is overly simplistic and not representative of known variability—biogenic and biomass burning emissions typically carry much greater uncertainty than anthropogenic sources. Furthermore, the spatial correlation structure of errors and the regularization approach are not well described. These assumptions critically affect the inversion and should be better supported by literature references, sensitivity tests, or at minimum, a comprehensive uncertainty discussion.

Inversion Framework and Terminology (Major)

The manuscript describes the method as 4DEnVar, yet no ensemble component appears to be used. The inversion resembles a standard 4D-Var framework. If an ensemble is not implemented, the use of "EnVar" terminology is misleading and should be corrected. If an ensemble is used, key details are missing, including ensemble generation, localization, hybrid covariance structures, etc. Additionally, the manuscript does not explain:

- The optimization method used to minimize the cost function
- Convergence criteria and number of iterations
- Use and selection of regularization
- Whether the GEOS-Chem adjoint model is used, and how it is implemented
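
For readers unfamiliar with the terminology, the role of the ensemble in an EnVar scheme can be sketched on a toy problem. The code below is a generic, textbook-style ensemble-variational update and is not the authors' system: the linear source-receptor matrix `H`, the ensemble size, the choice of centring perturbations on the prior, and the closed-form solve (in place of an iterative minimiser) are all assumptions made for illustration, and localization is omitted. In a 4D setting the vector `y` would stack observations from the whole assimilation window.

```python
import numpy as np


def envar_analysis(xb, ensemble, model, y, r_var):
    """One ensemble-variational update solved in ensemble (control-vector) space.

    xb       : prior emission vector, shape (n,)
    ensemble : perturbed emission vectors, shape (N, n)
    model    : callable mapping emissions to simulated observations, shape (m,)
    y        : observations, shape (m,)
    r_var    : observation error variances (diagonal R), shape (m,)
    """
    n_ens = ensemble.shape[0]
    scale = 1.0 / np.sqrt(n_ens - 1)
    hxb = model(xb)
    # Perturbation matrices: columns are scaled deviations from the prior run.
    X = scale * (ensemble - xb).T                                      # (n, N)
    Y = scale * np.stack([model(e) - hxb for e in ensemble], axis=1)   # (m, N)
    d = y - hxb                                                        # innovation
    # Minimiser of J(w) = 0.5 w^T w + 0.5 (d - Y w)^T R^-1 (d - Y w).
    A = np.eye(n_ens) + Y.T @ (Y / r_var[:, None])
    w = np.linalg.solve(A, Y.T @ (d / r_var))
    return xb + X @ w


# Toy usage with a hypothetical linear model; no adjoint is required.
rng = np.random.default_rng(0)
H = rng.uniform(0.5, 1.5, size=(6, 3))
truth = np.array([2.0, 1.0, 3.0])
xb = np.ones(3)
ensemble = xb * (1.0 + rng.normal(size=(20, 3)))  # placeholder 100 % relative spread
y = H @ truth
xa = envar_analysis(xb, ensemble, lambda e: H @ e, y, np.full(6, 0.01))
print("prior:", xb, "analysis:", xa)
```

The point of the sketch is that the background error covariance and the model sensitivity are both represented by ensemble perturbations, which is what distinguishes an EnVar scheme from an adjoint-based 4D-Var.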

Incomplete Statistical Evaluation of Results (Major)

The validation of the inversion results relies solely on RMSE. A more complete suite of statistical metrics is needed, including correlation coefficient, bias, normalized mean bias (NMB), and potentially others. This will allow for a more comprehensive understanding of model performance and assimilation impact.
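
The metrics listed above are straightforward to compute alongside RMSE. The following is a minimal sketch, assuming paired one-dimensional arrays of simulated and observed values; the function name and the returned dictionary keys are illustrative and not from the manuscript.

```python
import numpy as np


def evaluation_metrics(model, obs):
    """Return RMSE, mean bias, normalized mean bias and Pearson r for paired data."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Keep only pairs where both values are finite.
    valid = np.isfinite(model) & np.isfinite(obs)
    model, obs = model[valid], obs[valid]
    diff = model - obs
    return {
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "mean_bias": float(np.mean(diff)),
        "nmb": float(np.sum(diff) / np.sum(obs)),   # normalized mean bias
        "r": float(np.corrcoef(model, obs)[0, 1]),  # Pearson correlation
    }


# A model that is perfectly correlated with the observations but biased low.
stats = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(stats)
```

The example shows why RMSE alone is incomplete: the correlation is perfect while the normalized mean bias is large, and only the combination reveals that the error is a pure scaling problem.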

Insufficient Discussion of Scientific Implications (Major)

The target year, 2020, was heavily influenced by COVID-19-related emission reductions. This critical context is not introduced in the manuscript and must be incorporated into both the introduction and discussion sections. Specifically:

- Why was 2020 chosen for the inversion?
- How do inversion results indicating underestimation in prior emissions reconcile with pandemic-related expectations of reduced emissions?
- What implications do the findings have for air quality modeling or emission policy evaluation?

Specific Comments
Abstract

(Minor) Clarify whether the assimilation used OMPS only or both OMPS and TROPOMI. Identify which dataset(s) are considered "independent" validation.

(Minor) Define acronyms such as “NCP” (North China Plain) and explicitly mention the study year (2020).

(Minor) The statement “validated through comparisons against the independent satellite measurements and the surface ozone measurements” should specify which satellite and ozone datasets were used and what “validated” means quantitatively.

Introduction

(Major) Provide more detail on bottom-up NMVOC emission uncertainties by sector (anthropogenic, biogenic, biomass burning).

(Major) Expand the literature review of top-down VOC inversions. Important studies using various methods (e.g., Martin et al., 2003; Wells et al., 2020, 2022; Choi et al., 2022; Cao et al., 2018; Müller et al., 2024) are missing.

(Minor) p2, l2: Add a supporting reference for "became the major source region globally."

(Minor) p2, l9: Include reference to biomass burning inventories.

(Minor) p2, l13: Mention both emission factors and activity data.

(Minor) p2, l21–24: Include references for VOC measurement techniques.

(Minor) p2, l30–p3, l2: The discussion of glyoxal is unnecessary as it is not used in the study—suggest removing.

Methods

(Minor) p4, l10: Remove the word "sources".

(Minor) p6, l4: Clarify what is meant by biogenic emissions being the main source—this may not apply to NCP.

(Minor) p6, l10: The claim about biogenic dominance is inconsistent with the previous sentence. Please reconcile.

(Major) Section 2.3: Filtering criteria for OMPS and TROPOMI should be clearly described. Why are negative values removed only for TROPOMI? What thresholds are used for high outliers? What is the sensitivity to these choices?

(Major) Section 2.6: Provide full details on the inversion algorithm, adjoint model (if used), regularization, convergence, and assimilation setup for multiple satellite datasets.

(Minor) p9, l2–5: Add references for each cited method.

(Minor) p9, l14: Add publication year for Souri et al.

(Minor) p9, l15: Replace "superiority" with a specific performance attribute (e.g., lower noise, finer resolution).

Discussion and Results

(Major) Begin the discussion by comparing OMPS and TROPOMI retrievals pre-assimilation. Quantify differences and their potential impact.

(Major) Clarify whether the system constrains species and sectors independently. If so, discuss implications for chemical speciation and whether the results are physically plausible.

(Major) Provide figures on satellite retrieval uncertainty and error budgets in the main text—not just the supplement.

(Major) Discuss the impact of COVID-19 on emissions in 2020 and how it relates to your findings.

Minor Editorial Comments

Define all acronyms at first use (e.g., NCP, MEIC, CEDS).

Ensure units, abbreviations, and mathematical notations are consistently applied.

Review manuscript for grammar, sentence clarity, and fluency.
Citation: https://doi.org/10.5194/egusphere-2025-140-RC1
- AC1: 'Reply on RC1', Jianbing Jin, 22 Sep 2025
  
  We would like to thank the referee for the careful review and suggestions, which helped us significantly improve the quality of the manuscript. We sincerely hope these revisions address the reviewer’s concerns. Please find our responses in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-140-AC1
RC2:
'Comment on egusphere-2025-140', Anonymous Referee #1, 27 Jun 2025

This article presents an inverse modelling study of NMVOC emissions over China using the GEOS-Chem model and formaldehyde observations from OMPS and TROPOMI. The paper also compares and evaluates GEOS-Chem, OMI, OMPS and TROPOMI formaldehyde distributions through a semi-variogram analysis. The emission optimization results in a substantial enhancement of NMVOC emissions from all sectors over China. Although the topic is interesting and well within the scope of this journal, the results appear to be unreliable, due to several major methodological issues, as detailed below.
1) The satellite data are not properly used. An important issue is filtering. Let's begin with OMI. Figure 3 shows very high columns over Tibet (~2E16 molec cm^-2), which are clearly impossible. This feature is not seen in other studies using OMI, e.g. Cao et al. 2018 (also using SAO OMI) and Müller et al. 2024. Although OMI filtering information is missing in this manuscript, I strongly suspect that negative columns are filtered out, leading to a strong positive bias in the averaged columns. The effect is most prominent in regions with low columns and high uncertainties (like Tibet), but it affects all regions. Filtering of negative values is also done by Xu et al. for TROPOMI data (see Sect. 2.3.2), which also leads to overestimation, although to a smaller extent than for OMI, simply because TROPOMI is less noisy. Regarding OMPS, the authors claim that they "filtered out data points where the product of formaldehyde columns and three times the observation uncertainty was less than zero" (Sect. 2.3.1). This is very strange. I suppose that they meant "the sum", not "the product". In any case, the filtering likely causes a positive bias.
2) The optimization of emissions relies on the comparison of monthly averaged modelled and satellite columns. However, the satellite average excludes cloudy pixels, whereas the model average does not. Including the cloudy days in the model averages causes a negative bias with respect to the satellite averages. Furthermore, it is not even stated whether the model columns are sampled at the satellite overpass time. This should be clarified. Finally, the manuscript makes no mention of averaging kernels (or scattering weights). Applying averaging kernels to the model profiles is essential to minimize the effect of differences in vertical profile shape between the model and the a priori profiles adopted in the satellite retrieval.
3) TROPOMI and OMI HCHO products have significant biases; see Zhu et al., Vigouroux et al. 2020, Oomen et al. 2024, and Müller et al. 2024 (https://doi.org/10.5194/acp-24-2207-2024). I am not sure whether OMPS data were similarly evaluated. In any case, the biases should be either corrected for or discussed within the manuscript, along with the potential implications for the emissions.

4) The methodology is not well described. For example:

- no detail is provided on the implementation of MEGAN, besides the fact that the emissions are calculated off-line. What is their temporal resolution, what meteorological fields are used, what vegetation maps and emission factors, etc.?

- the description of OMI data is too short

- the motivation and added value of the semi-variogram analysis is not clear. No surprise that OMI data are revealed to be more noisy than the other datasets, since year 2020 is used here, >15 years after the launch of OMI.

- more information is needed to explain the details of how emissions are really optimized. The assumed uncertainty on the prior emissions should be given and justified.

- it is impossible to understand some sentences, for example (l. 14 on page 10) "A small l means more errors in fine scale could be resolved using the assimilation, while however requires more ensemble runs to represent the model realization from emission to simulation". Please clarify.

- the representativity error is taken as the standard deviation of the columns around their monthly means: see remark above on the temporal sampling issue. This error can and should be taken care of through appropriate sampling of model concentrations.

- from Sect. 2.3.1, it would seem that geometric air mass factors are used for the OMPS retrieval, which is very strange since the retrieval described by Nowlan et al. 2023 incorporates a detailed AMF calculation. This should be clarified. If geometric AMFs really are being used, the product would be inappropriate for emission optimization.

5) The results are insufficiently discussed and validated against independent datasets such as ground-based HCHO or VOC concentration data and flux measurements.
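
The filtering concern raised in point 1 can be reproduced with a short numerical experiment. The sketch below uses synthetic data only: the true column and the single-pixel noise level are hypothetical values chosen to mimic a clean, noisy region, and the retrieval noise is assumed to be Gaussian and unbiased.

```python
import numpy as np

rng = np.random.default_rng(42)
true_column = 0.1    # hypothetical true column, in 1e16 molec cm^-2
noise_sigma = 1.0    # hypothetical single-pixel retrieval noise, same units
pixels = true_column + rng.normal(0.0, noise_sigma, size=100_000)

# Averaging all pixels lets positive and negative noise cancel.
mean_all = pixels.mean()

# Discarding negative retrievals removes only one side of the noise distribution.
mean_positive_only = pixels[pixels > 0].mean()

print(f"true column:            {true_column:.3f}")
print(f"mean of all pixels:     {mean_all:.3f}")
print(f"mean after filtering:   {mean_positive_only:.3f}")
```

Under these assumptions the unfiltered mean stays close to the true column, while the mean of the positive-only pixels is several times larger. The size of the effect shrinks as the true column grows relative to the noise, which is consistent with the bias being most visible over low-column regions.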

For the reasons above, the manuscript cannot be accepted for publication in its current form. The entire emission optimization would have to be redone with appropriate processing of satellite data and model concentrations. I would recommend dropping the semi-variogram analysis and the OMI data, at least if 2020 is still the only year addressed by the study. The model and the assimilation method need to be better explained. Finally, Figure 1 and many others have insets with the South China Sea islands. This is useless, since the colors cannot be distinguished, the islands being too small. Pushing a territorial claim on disputed islands is clearly inappropriate in this journal.
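
The averaging-kernel step mentioned in point 2 can be sketched as follows. This is a generic illustration, not the processing used in the manuscript: it assumes the model profile has already been sampled at the overpass time and interpolated onto the retrieval layers, that the product provides a column averaging kernel per layer, and that the layer values below are invented.

```python
import numpy as np


def smoothed_model_column(partial_columns, averaging_kernel):
    """Model column as the retrieval would see it: sum over layers of A_k * x_k.

    partial_columns  : model partial columns on the retrieval layers
    averaging_kernel : column averaging kernel on the same layers
    """
    partial_columns = np.asarray(partial_columns, dtype=float)
    averaging_kernel = np.asarray(averaging_kernel, dtype=float)
    if partial_columns.shape != averaging_kernel.shape:
        raise ValueError("profile and averaging kernel must share the same layers")
    return float(np.sum(averaging_kernel * partial_columns))


# Hypothetical four-layer profile (surface first), in 1e15 molec cm^-2 per layer.
model_layers = [4.0, 2.0, 1.0, 0.5]
# Hypothetical kernel with reduced sensitivity near the surface.
kernel = [0.4, 0.8, 1.2, 1.5]

plain_column = float(np.sum(model_layers))
smoothed_column = smoothed_model_column(model_layers, kernel)
print(plain_column, smoothed_column)
```

Comparing the satellite column with the smoothed model column, rather than with the plain vertical sum, removes the dependence of the comparison on the a priori profile shape used in the retrieval.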

Citation: https://doi.org/10.5194/egusphere-2025-140-RC2
- AC2: 'Reply on RC2', Jianbing Jin, 22 Sep 2025
  
  We would like to thank the referee for the careful review and suggestions, which helped us significantly improve the quality of the manuscript. We sincerely hope these revisions address the reviewer’s concerns. Please find our responses in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-140-AC2

Supplement

https://doi.org/10.5194/egusphere-2025-140-supplement

Viewed

Total article views: 919 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
815	77	27	919	37	36	47

Views and downloads (calculated since 07 May 2025)

Month	HTML	PDF	XML	Total
May 2025	112	26	10	148
Jun 2025	78	12	3	93
Jul 2025	46	9	0	55
Aug 2025	102	6	1	109
Sep 2025	435	11	12	458
Oct 2025	38	11	1	50
Nov 2025	4	2	0	6

Viewed (geographical distribution)

Total article views: 954 (including HTML, PDF, and XML), thereof 954 with geography defined and 0 with unknown origin.

Latest update: 07 Nov 2025

Short summary

This study optimizes non-methane volatile organic compound (NMVOC) emissions in China using satellite formaldehyde retrievals. A semi-variogram spatial analysis demonstrated the advantages of the TROPOMI and OMPS products over the conventional OMI product. Emission inversion was then applied, resulting in improved emission inventories. The optimized inventories significantly enhance model simulations of NMVOCs and ozone, with notable accuracy improvements nationwide, especially in polluted regions such as the North China Plain (NCP).