Near real-time inversion of high-resolution anthropogenic carbon emissions in the Pearl River Delta region based on the four-dimensional local ensemble transform Kalman filter

Wang, Yike; Cheng, Yueming; Mai, Boru; Dai, Tie; Deng, Xuejiao; Deng, Tao; Zhao, Xiaoli; Diao, Yiwei; Xia, Feng; Liang, Miao; Li, Ying; Zhu, Yixiao

doi:10.5194/egusphere-2025-6272

Preprints

https://doi.org/10.5194/egusphere-2025-6272

31 Dec 2025

Abstract. For climate mitigation, it is necessary to address the dynamic updating and assessment of CO₂ emissions at regional scales. This study developed a kilometer-scale carbon assimilation system (the Guangzhou Regional Atmospheric Composition and Environment Forecasting System–Greenhouse Gas–Data Assimilation, GRACES-GHG-DA) by coupling the Weather Research and Forecasting–Greenhouse Gas (WRF-GHG) model with the four-dimensional local ensemble transform Kalman filter (4D-LETKF). GRACES-GHG-DA constructs a near-real-time 4-km anthropogenic emission inventory, constrained by simulated CO₂ observation data from seven high-precision greenhouse gas monitoring stations in the Pearl River Delta (PRD) region, to analyze spatiotemporal emission distributions and their relationship with ambient CO₂ concentrations. The results indicate that: (1) GRACES-GHG-DA accurately downscales CO₂ concentrations from a resolution of 36 to 4 km, with the finer resolution better capturing meso- and micro-scale variations (hourly and monthly mean biases of −0.77 and −0.51 ppm, respectively). (2) In 2022, the inverted annual anthropogenic CO₂ flux in core PRD areas exceeded 7500 g C m⁻² a⁻¹, contrasting with values below 1000 g C m⁻² a⁻¹ in peripheral regions. Compared to the inversion estimates, statistical inventories (EDGAR, ODIAC, GCP, and MEIC) underestimated total emissions by 14.71% on average. (3) Seasonal anthropogenic emissions were 24.03, 29.86, 30.61, and 27.26 Tg C for spring, summer, autumn, and winter, respectively, showing a unimodal diurnal pattern largely influenced by fossil-fuel electricity generation. (4) Anthropogenic emissions are not the dominant factor governing atmospheric CO₂ concentrations in the PRD; vegetation carbon uptake/release, boundary layer evolution, and regional transport also play critical roles.

Received: 16 Dec 2025 – Discussion started: 31 Dec 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

CC1: 'Comment on egusphere-2025-6272', Nima Zafarmomen, 02 Jan 2026

This paper presents a sophisticated approach to carbon cycle science by developing the GRACES-GHG-DA system. It bridges the gap between mesoscale meteorological modeling (WRF-GHG) and advanced data assimilation (4D-LETKF) to achieve kilometer-scale, hourly updates for anthropogenic CO₂ emissions in one of the world's most complex urban clusters: the Pearl River Delta (PRD). The novelty of this research lies in its spatiotemporal granularity and the application of 4D-LETKF for near real-time inversion. While global and regional models often struggle with the "representation error" of urban environments, this study successfully downscales to a 4-km resolution, allowing for the identification of meso- and micro-scale variations that coarser models (36 km) overlook.
The use of 4D-LETKF represents a meaningful advancement over more commonly applied 3D EnKF or EnSRF approaches, especially in the assimilation of asynchronous hourly observations. However, the manuscript would benefit from a clearer articulation of its novelty relative to previous WRF-GHG-based regional inversion studies. While the technical improvements are evident from the results, explicitly highlighting how the near-real-time, hourly-updated anthropogenic emission inversion and the kilometer-scale resolution extend beyond prior work would strengthen the paper’s positioning.
The comparison between the top-down inversion and bottom-up inventories is one of the manuscript’s strengths. The spatial patterns of disagreement are well analyzed, particularly in the PRD urban core. However, the discussion could be deepened by offering more interpretation of why certain inventories, especially MEIC, show large underestimations in core cities. Even a qualitative sectoral explanation would help contextualize these differences and enhance the relevance for inventory developers and policymakers.
To strengthen the discussion on urban emission heterogeneity and the challenges of capturing mobile source contributions (specifically mentioned in Section 3.3 regarding traffic emissions), I strongly suggest that the authors cite the following paper:
Comprehensive spatiotemporal analysis of long-term mobile monitoring for traffic-related particles in a complex urban environment. DOI: 10.1016/j.apr.2025.102870

Citation: https://doi.org/10.5194/egusphere-2025-6272-CC1
RC1: 'Comment on egusphere-2025-6272', Anonymous Referee #1, 26 Jan 2026
General assessment

Overall, this is an interesting and potentially valuable study. The authors couple the widely used WRF-GHG model with a four-dimensional local ensemble transform Kalman filter (4D-LETKF) to estimate anthropogenic CO₂ emissions in the Pearl River Delta (PRD) region for the year 2022, using in situ CO₂ observations from seven high-precision monitoring stations. The ability to derive top-down CO₂ emission estimates for a large and highly urbanized region such as the PRD is highly relevant for climate mitigation efforts and provides an important independent validation of bottom-up emission inventories.

The chosen methodological framework is, in principle, well suited for addressing this problem. WRF-GHG, coupled with VPRM, and embedded in a 4D-LETKF inversion framework, represents a state-of-the-art approach for regional carbon flux estimation. The results show systematic upward adjustments of emissions in the urban core, particularly around Shenzhen and Guangzhou, which lead to correspondingly higher simulated CO₂ concentrations. The inferred seasonal cycle, with higher emissions in summer and autumn and lowest emissions in spring, is plausible for the PRD region and can reasonably be linked to variations in energy demand, in particular electricity generation for cooling. The results are presented from multiple perspectives, including comparisons with several emission inventories and analyses of modeled CO₂ concentrations.

However, in my view the study has a major weakness: it lacks a rigorous evaluation of the CO₂ simulations and of the inversion system itself. Modeling atmospheric CO₂ at regional scales is inherently challenging, and regional inversions are particularly susceptible to systematic biases in emission estimates. These challenges are well known in the community, and considerable effort is currently made to address them. While the authors briefly acknowledge some of these issues in the Results section, no concrete evaluation or sensitivity analysis is carried out. As a result, it is difficult to assess the robustness of the inferred emission adjustments.

This concern can be summarized in three points:

First, biospheric fluxes represent a major source of uncertainty. Photosynthetic uptake and ecosystem respiration are both large fluxes that partially cancel each other, and small errors in their representation can strongly affect inferred anthropogenic emissions. In a regional inversion framework, biospheric fluxes should ideally be jointly optimized, or at least rigorously evaluated. In the present study, however, no validation of the vegetation fluxes is provided. At a minimum, a sensitivity analysis of the emission estimates with respect to the net ecosystem exchange (NEE) would be necessary to assess the impact of biospheric flux uncertainties.

Second, the treatment of background CO₂ concentrations is insufficiently constrained. While CarbonTracker products may exhibit small biases at large scales, their representation at regional scales is limited. In regional inversions, it is common practice to include an upwind background station and to simultaneously optimize background concentrations in order to avoid systematic biases in emission estimates arising from background errors. This aspect is not addressed in the current study.

Third, the assimilation of nighttime CO₂ observations (if performed) raises additional concerns. Nighttime boundary layer heights are often overestimated in atmospheric models, which can lead to systematic concentration biases. An evaluation of model performance with respect to boundary layer dynamics, particularly at night, would therefore be essential to assess the reliability of the inferred emissions.

The systematic upward correction of a posteriori CO₂ concentrations found in the study could plausibly be attributed, at least in part, to one or a combination of these three factors. This issue requires a much more thorough investigation.

Also, the term “near real-time” in the title appears misleading, as the study presents a retrospective inversion for the year 2022 rather than an operational or low-latency system. While the framework may be suitable for near-real-time applications, this is not demonstrated in the current analysis.

Finally, several crucial methodological details are missing. Critical information about the observations (e.g., sampling heights, assimilated time periods, potential rejection thresholds) and about the inversion system itself (e.g., ensemble size, spatial correlation lengths, localization strategy or cut-off radii, and assumed observation error correlation structures) is not provided. These details are essential for evaluating and interpreting the results and must be included for the study to be reproducible and scientifically assessable.

Minor / Technical Comments

S2L45–50: Consider citing ICON-ART applications for CO₂ simulations and inversions (e.g., Ponomarev et al., 2026). This reference is also relevant for S2L82–85, as it similarly addresses urban CO₂ emissions.

S4L101: A brief description of the 4D-LETKF method, its key features, and references to foundational studies (e.g., Hunt et al., Ott et al., etc.) would help contextualize the methodology for readers unfamiliar with it.

S4L111: The phrase “inverted CO₂ concentrations” is misleading. Emissions are inferred by the inversion, not the concentrations themselves. A more appropriate term would be posterior or analyzed CO₂ concentrations.

S4L118: Please add a reference for VPRM (e.g., Mahadevan et al., 2008).

S7L179: Clarify the definition of xb. It is the ensemble mean, not the emissions; EDGAR provides the prior from which the ensemble is drawn. Also specify whether the stated uncertainty corresponds to 1σ or 2σ.

S7L180: The correlation length (and structure) used in the inversion should be stated explicitly.

S7L181: The description of Xb as a “matrix of each ensemble member” is unclear. Consider phrasing it as the ensemble perturbation matrix.

S7L187: Provide details on the observation error covariance R, including the magnitudes and the assumed correlation structure. How are they determined? Are they dependent on the prior concentrations or meteorological conditions?

S7 Sect. 2.3: Several methodological aspects remain unclear. For example, how and where was localization applied? Which observations were assimilated, including night-time measurements? What are the prior uncertainty correlation lengths and the observation error assumptions?

S8L198–199: The formulation is imprecise. Xa represents ensemble perturbations, not a state ensemble. The posterior ensemble members are reconstructed from x̄a and Xa, not by simply adding x̄a to Xa.
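
For reference, the quantities discussed in this comment can be written out in the notation commonly used for the LETKF (following Hunt et al.). This is a sketch of the textbook formulation, not of the manuscript's implementation: the ensemble size $k$, the inflation factor $\rho$, the observation-space perturbation matrix $Y^b$, and the observation vector $y^o$ are symbols assumed here and do not appear in the comment. In the four-dimensional variant, $y^o$, $\bar{y}^b$, and $Y^b$ collect observations and their model equivalents from all times within the assimilation window.

```latex
% Background ensemble mean and perturbation matrix (columns i = 1..k)
\[
\bar{x}^b = \frac{1}{k}\sum_{i=1}^{k} x^{b(i)}, \qquad
X^b = \left[\, x^{b(1)} - \bar{x}^b,\ \dots,\ x^{b(k)} - \bar{x}^b \,\right]
\]

% Analysis in ensemble space
\[
\tilde{P}^a = \left[ \frac{k-1}{\rho}\, I + (Y^b)^{\mathsf T} R^{-1} Y^b \right]^{-1}, \qquad
\bar{w}^a = \tilde{P}^a (Y^b)^{\mathsf T} R^{-1} \left( y^o - \bar{y}^b \right), \qquad
W^a = \left[ (k-1)\, \tilde{P}^a \right]^{1/2}
\]

% Posterior mean, posterior perturbations, and individual posterior members
\[
\bar{x}^a = \bar{x}^b + X^b \bar{w}^a, \qquad
X^a = X^b W^a, \qquad
x^{a(i)} = \bar{x}^b + X^b \left( \bar{w}^a + W^a_{(:,i)} \right)
\]
```

In this form $X^a$ holds deviations from the posterior mean rather than full states, and each posterior member is built column by column from the mean and the corresponding weight vector.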

S8L208: The term “observation biases” is likely intended to describe residuals or innovations. Consider clarifying this.

S9L230 ff.: Provide the sampling height of the in-situ CO₂ observations, which is critical for interpreting results.

S10L263: Comparing posterior concentrations at in-situ stations with (prior) satellite XCO₂ is problematic. It’s not an apples-to-apples comparison due to fundamentally different vertical sensitivities and averaging kernels. Please clarify the rationale.

S9 Sect. 3: So far, the inversion period has not been explicitly stated. Please do so.

S9 Sect. 3: The analysis shows that the inversion reproduces the assimilated observations better than the prior forward run (which is what an inversion is supposed to do), but this is not a proper validation. Are there independent stations that could be used? Fit-to-obs validations, such as the chi-squared metrics, could help quantify fit quality.

S10L267–268: Strongly improved agreements with assimilated observations may reflect overfitting. Presenting chi-squared or fit-to-observation statistics would help assess the robustness of the inversion.
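
The chi-squared check suggested in these comments can be sketched as follows. This is a minimal illustration under stated assumptions, not the manuscript's diagnostic: errors are taken to be uncorrelated, so the expected variance of each innovation (observation minus prior model equivalent) is the sum of the squared prior ensemble spread in observation space and the squared observation error. The function name and all numerical values are hypothetical.

```python
from statistics import fmean


def reduced_chi_squared(obs, prior_mean, prior_spread, obs_error):
    """Mean of squared innovations normalised by their expected variance.

    Assumes uncorrelated errors, so the expected innovation variance at each
    observation is prior_spread**2 + obs_error**2 (a diagonal approximation).
    """
    if not (len(obs) == len(prior_mean) == len(prior_spread) == len(obs_error)):
        raise ValueError("all inputs must have the same length")
    terms = []
    for y, hx, s_b, s_o in zip(obs, prior_mean, prior_spread, obs_error):
        innovation = y - hx  # observation minus prior model equivalent
        terms.append(innovation ** 2 / (s_b ** 2 + s_o ** 2))
    return fmean(terms)


# Hypothetical hourly CO2 values in ppm, for illustration only.
obs = [420.0, 425.0, 431.0, 418.0]
prior_mean = [418.0, 426.0, 428.0, 419.0]
prior_spread = [1.5, 1.5, 2.0, 1.0]
obs_error = [1.0, 1.0, 1.0, 1.0]

chi2 = reduced_chi_squared(obs, prior_mean, prior_spread, obs_error)
print(f"reduced chi-squared = {chi2:.2f}")
```

For innovations, a value near 1 indicates that the assumed prior and observation errors are consistent with the actual misfits; values well above 1 point to underestimated errors or model bias, and values well below 1 to overestimated errors.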

S12L291: The statement attributing discrepancies solely to underestimated EDGAR emissions may be oversimplified. Other factors, such as uncertainties in biospheric fluxes, background concentrations, or night-time boundary layer representation, should be tested and discussed (see my general comments).

S13L304: Correct terminology: time series rather than distributions.

S13 Fig. 6: The inversion reproduces the assimilated observations, but it is unclear how this compares to observational uncertainties. Possible overfitting should be discussed; again, independent validation or chi-squared metrics would be informative.

S14L330: In particular, the influence of a misrepresented nocturnal PBL height could be large and contribute to systematic biases.

S17 Fig. 9b: Increase the figure size. Inversion-EDGAR maps are particularly important, as they reflect actual emission innovations. Please indicate station locations on the maps.

S17 Fig. 9b: Inversion-EDGAR: Why do we see the pattern where upward adjustments are surrounded by downward adjustments?

S17 and all other figures: Please use perceptually uniform colormaps.

S19 Fig. 11 and S22 Fig. 13a: Displaying sectoral emission contributions (traffic, energy, industry) would help interpret seasonal and diurnal emission patterns. Their discussion would strengthen the interpretation of the results.

S22L515: In addition, the winter monsoon transports air from inland regions that has already been influenced by emissions. It is not only the higher CO₂ concentrations of the air coming from higher latitudes.

Results section (general): Some descriptions of individual numerical values are lengthy but add little value beyond the figures. Consider condensing the text and focusing on interpretation and key insights.
Citation: https://doi.org/10.5194/egusphere-2025-6272-RC1
RC2: 'Comment on egusphere-2025-6272', Anonymous Referee #2, 16 Feb 2026

The manuscript "Near real-time inversion of high-resolution anthropogenic carbon emissions in the Pearl River Delta region based on the four-dimensional local ensemble transform Kalman filter" aims to document a 4D-LETKF system as a tool for the monitoring of CO2 sources and sinks. It also provides new estimates and analysis of these fluxes for the PRD region.
The study relies on an inversion configuration which can be seen, in terms of spatial extent and spatial resolution, as intermediate between regional scale CO2 inversion configurations (usually focused on CO2 natural fluxes), and urban scale CO2 inversion configurations (focused on fossil fuel emissions). This is due to the complex intertwining of urban areas in the region. Suitable analysis and discussions on the potential to monitor anthropogenic emissions and natural fluxes in such an area, and on the type of observation network and inversion configuration required to support such a monitoring would have been extremely interesting.
However, this manuscript does not meet these expectations and the study and writing do not meet the criteria for scientific publication:
The introduction is an empty nutshell, with
- two pages of overview on the CO2 inversions and on the need for regional and urban inversions, which are often meaningless, and which do not bring any useful basis for the study;
- less than half a page (l94-l105) to enter into the topic of the study and its objectives, which hardly brings food for thought; these lines hardly draw a rationale for these objectives, and these objectives look a bit sparse; so far, the authors have not spoken about the PRD
- the outline
The introduction waits for the announcement of section 3 before mentioning the PRD, and it ignores the observation network that will support the ambition to invert the fluxes in this region, as if the inversion system could do it thanks to its own properties. It does not specify how this new study of the PRD connects to the previous ones, especially to those from the same research group. Like in the abstract, the wording is often unsuitable.
Section 2 provides a theoretical framework for the inversion system and pieces of practical information on input parameters and fields. However, it forgets to provide primary indications on the inversion configuration and protocol, and their relevance with respect to the inversion problem, starting with the control vector: what do the authors control? The anthropogenic emissions only, or both the natural flux (totally ignored in the analysis of the inversion results) and the anthropogenic emissions? It also forgets the protocol of the nesting between the 3 inversion domains (d01, d02 and d03), the setup of the statistics of uncertainties in the inversion system, of the observation vector, etc. One piece of information arises from the following sections: the system seems to assimilate all hourly bins of CO₂ measurements at all stations, including nighttime ones, unlike almost all CO₂ inversion studies so far. Unless the authors demonstrate that they manage to do it correctly, I hardly believe that they do, and this not only questions the presentation of the inversion configuration, but also this configuration itself, and thus, the whole study and analysis.
This section 2 does not bring much more insight on the observation network (suitability for the proposed analysis, urban vs. peri-urban vs. rural stations, location of the stations with respect to the main emission and urban areas and to the typical wind and transport conditions etc.) than the introduction, dedicating less than 10 lines to them. Note that the comparison between the station locations in fig 1 and the maps of emissions in figure 9 does not seem to fit well with the labelling urban/suburban of the station given in the course of section 3 (e.g. at lines 318-319); this would have to be clarified.
The authors highlight two advantages of their inversion system: the assimilation of asynchronous observations and the assimilation of hourly observations or the control of fluxes at 1-hour temporal resolution (the text is extremely unclear at line 103). I do not see the ability to assimilate asynchronous observations as a major advance by the 4D-LETKF, since this is a standard and usual aspect of global, regional and urban scale inversion systems. Actually, I hardly see how an inversion system solving for the CO₂ fluxes by assimilating concentration data would work properly without such a capacity. Almost all the regional and urban scale inversion systems assimilate hourly observations. I would say that the control of fluxes at 1-hour resolution is technically feasible, albeit computationally expensive, for most of the existing regional and urban scale systems; but, if they use such a control temporal resolution, where do the authors demonstrate that they exploited it and got meaningful results out of it?
I am surprised by the regular statement that the 4-km resolution configuration solves for the "micro-scale" transport. In the same way, I do not understand the use of the term "near real-time" (even in the title).
Sections 3, 4 and 5 do not resolve the major concerns raised by the previous sections, and pile up problems and strange sequences of diagnostics-discussion, among which:
- the general lack of rigor and clarity, e.g. when providing statistics (against which dataset at lines 251-253, or using which binning at line 410?) or when providing legends to figures (do we see the surface concentrations in figure 8?)
- the paragraph on page 10 indicating that d03 provides poorer (even though similar) statistics of misfits to the assimilated observations compared to the coarser resolution d01 and d02, and stating that "The results demonstrated that data assimilation enabled GRACES-GHG-DA to accurately simulate the spatial and temporal variations of atmospheric CO₂ across scales of 36 to 4 km in the PRD. The high-resolution d03 domain, in particular, captured the kilometer-scale characterization of CO₂ evolution" (see also the abstract: "The results indicate that: GRACES-GHG-DA accurately downscales CO₂ concentrations from a resolution of 36 to 4 km, with the finer resolution better capturing meso- and micro-scale variations"). Actually, the authors never really demonstrate the asset of the 4 km resolution compared to the coarser ones, and they do not demonstrate that this resolution is high enough for their application even though urban scale inversion frameworks tend to rely on finer spatial resolutions.
- the comparison of such statistics of misfit to the observation dataset specific to this study with those from different studies with different observation datasets (including a study with satellite data) (lines 263-266)
- the analysis and discussions on the seasonal variations of the anthropogenic emissions: I am ready to trust the authors' reasoning even though the emissions look surprisingly high in autumn compared to spring, but overall sections 5.1 and 5.2 lack references, and we have no information on the potential temporal variations assigned to the prior estimate (EDGAR) or in the independent inventories to support all this discussion, and figure 5 gives the feeling that something goes wrong in September, when the posterior CO₂ concentrations go too high, beyond the observations, at almost all the sites
- maybe the most striking diagnostic since it is given towards the end of the paper, and since it is also included in section 6: the authors see in figure 13b (the average diurnal cycle of CO₂ concentrations) a "bi-modal structure" with "peaks occurring between 05:00–07:00 and 21:00–23:00 LST", probably missing the fact that in a mean diurnal cycle plot, the curve at 23:00 on the right loops back to 00:00 on the left, so that, in the PRD (like almost everywhere), there does not seem to be a maximum at 23:00.
- the last paragraph, which comes back to basics of the CO₂ diurnal variations, and which realizes that "anthropogenic emissions are not the dominant factor regulating CO₂ concentrations in the study region. Our study hypothesizes that, in addition to anthropogenic carbon emissions, factors including vegetation conditions, boundary layer structure, and regional atmospheric transport may also exert important regulatory effects on CO₂ concentrations." arrives far too late, because (i) such basic considerations are the basis of the inversion frameworks and configurations, and (ii) the whole analysis focused on the anthropogenic emissions so far (see section 3.3 for example), and ignored the potential uncertainty arising from the natural fluxes. Actually, the biases between the prior simulations and the observations across the sites often seem to be driven by the boundary conditions and the natural fluxes more than by the anthropogenic emissions, even though the authors have commented on these biases in the previous sections based on considerations on the latter.
- the text regularly assumes that the anthropogenic emission estimates from the inversions are necessarily very accurate (and more accurate than the inventories) whatever the observation network and the potential sources of uncertainties (see e.g. the paragraph at line 405)
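
The wrap-around point about the mean diurnal cycle raised in the list above can be illustrated with a short sketch. It assumes a 24-value list of hourly means and compares two ways of finding local maxima: one that treats 23:00 and 00:00 as neighbours and one that treats the plot edges as open ends. The function name and the hourly values are hypothetical and are not taken from figure 13b.

```python
def local_maxima(hourly_means, wrap=True):
    """Return the hours whose value exceeds both neighbouring hours.

    With wrap=True the series is treated as circular, so hour 23 is compared
    with hour 0. With wrap=False a missing neighbour at either end of the
    series is ignored, which is how a peak at the plot edge can appear.
    """
    n = len(hourly_means)
    peaks = []
    for h, value in enumerate(hourly_means):
        if wrap:
            left = hourly_means[(h - 1) % n]
            right = hourly_means[(h + 1) % n]
        else:
            left = hourly_means[h - 1] if h > 0 else float("-inf")
            right = hourly_means[h + 1] if h < n - 1 else float("-inf")
        if value > left and value > right:
            peaks.append(h)
    return peaks


# Hypothetical mean diurnal cycle (ppm): nocturnal build-up that continues
# past midnight, an early-morning maximum, and an afternoon minimum.
cycle = [428, 429, 430, 431, 432, 433, 434, 432, 429, 426, 423, 421,
         419, 418, 417, 416, 417, 418, 420, 422, 424, 425, 426, 427]

print("edges treated as open ends:", local_maxima(cycle, wrap=False))
print("circular day:", local_maxima(cycle, wrap=True))
```

With these illustrative values the open-ended search reports hours 6 and 23, whereas the circular search reports only hour 6, because the value at 23:00 is lower than the value at 00:00 that follows it.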
Section 6 is merely a short summary of these previous sections and does not fill the many gaps in the reasoning.
The above list of concerns is far from exhaustive, but I think that this already justifies a rejection of the manuscript. The authors should carefully rethink all aspects for such a study, from the approach to the problem of the monitoring of the CO2 fluxes in the PRD region to the diagnostics and writing.

Citation: https://doi.org/10.5194/egusphere-2025-6272-RC2

Supplement

https://doi.org/10.5194/egusphere-2025-6272-supplement

Viewed

Total article views: 454 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
306	120	28	454	43	131	132

Views and downloads (calculated since 31 Dec 2025)

Month	HTML	PDF	XML	Total
Dec 2025	19	2	1	22
Jan 2026	226	86	20	332
Feb 2026	61	32	7	100

Viewed (geographical distribution)

Total article views: 448 (including HTML, PDF, and XML) Thereof 448 with geography defined and 0 with unknown origin.

Latest update: 01 Mar 2026

Short summary

We developed a carbon assimilation system to construct a near-real-time 4-km anthropogenic inventory in the Pearl River Delta. We analyze spatiotemporal emission distributions, compare the inversion inventories against statistical emissions, and elucidate their relationship with ambient CO₂ concentrations. The results of this study provide a robust scientific basis for advancing the dynamic updating and quantitative evaluation of anthropogenic emissions across meso- to microscale domains.

