A bias-corrected GEMS geostationary satellite product for nitrogen dioxide using machine learning to enforce consistency with the TROPOMI satellite instrument

Oak, Yujin J.; Jacob, Daniel J.; Balasus, Nicholas; Yang, Laura H.; Chong, Heesung; Park, Junsung; Lee, Hanlim; Lee, Gitaek T.; Ha, Eunjo S.; Park, Rokjin J.; Kwon, Hyeong-Ahn; Kim, Jhoon

doi:https://doi.org/10.5194/egusphere-2024-393

Preprints


04 Apr 2024

Abstract. The Geostationary Environment Monitoring Spectrometer (GEMS) launched in February 2020 is now providing continuous daytime hourly observations of nitrogen dioxide (NO₂) columns over East Asia (5° S–45° N, 75° E–145° E) with 3.5 × 7.7 km² pixel resolution. These data provide unique information to improve understanding of the sources, chemistry, and transport of nitrogen oxides (NOₓ) with implications for atmospheric chemistry and air quality, but opportunities for direct validation are very limited. Here we correct the operational level-2 (L2) NO₂ vertical column densities (VCDs) from GEMS with a machine learning (ML) model to match the much sparser but more mature observations from the low Earth orbit TROPOspheric Monitoring Instrument (TROPOMI), preserving the data density of GEMS but making them consistent with TROPOMI. We first reprocess the GEMS and TROPOMI operational L2 products to use common prior vertical NO₂ profiles (shape factors) from the GEOS-Chem chemical transport model. This removes a major inconsistency between the two satellite products and greatly improves their agreement with ground-based Pandora NO₂ VCD data in source regions. We then apply the ML model to correct the remaining differences, Δ(GEMS-TROPOMI), using as predictor variables the GEMS NO₂ VCDs and retrieval parameters. We train the ML model with collocated GEMS and TROPOMI NO₂ VCDs, taking advantage of TROPOMI off-track viewing to cover a wide range of effective zenith angles (EZAs) for the GEMS diurnal profiles. The two most important predictor variables for Δ(GEMS-TROPOMI) are GEMS NO₂ VCD and EZA. The corrected GEMS product is unbiased relative to TROPOMI and shows a diurnal variation over source regions more consistent with Pandora than the operational product.
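The reprocessing step described in the abstract, replacing each product's prior NO₂ profile with a common one, can be illustrated with the standard averaging-kernel relation. The sketch below is not the authors' code. It assumes a column averaging kernel defined as scattering weight divided by the original air mass factor (AMF), ignores any distinction between total and tropospheric AMFs, and uses hypothetical function and argument names.

```python
import numpy as np

def recompute_vcd(vcd_old, amf_old, averaging_kernel, prior_partial_columns):
    """Swap the prior profile of a retrieval for a new one (illustrative only).

    averaging_kernel: per-layer AK from the L2 product, assumed AK_i = w_i / AMF_old
    prior_partial_columns: per-layer NO2 partial columns of the new prior
                           (for example from a chemical transport model)
    """
    ak = np.asarray(averaging_kernel, dtype=float)
    x = np.asarray(prior_partial_columns, dtype=float)
    shape = x / x.sum()                      # normalized shape factor
    amf_new = amf_old * np.sum(ak * shape)   # AMF_new = sum_i w_i * S_i
    vcd_new = vcd_old * amf_old / amf_new    # slant column is left unchanged
    return amf_new, vcd_new
```

Because the slant column is untouched, the new VCD differs from the old one only through the ratio of the two AMFs. Applying the same prior to both instruments removes that source of disagreement before any statistical correction is attempted.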

Received: 09 Feb 2024 – Discussion started: 04 Apr 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.


Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint

05 Sep 2024

A bias-corrected GEMS geostationary satellite product for nitrogen dioxide using machine learning to enforce consistency with the TROPOMI satellite instrument

Yujin J. Oak, Daniel J. Jacob, Nicholas Balasus, Laura H. Yang, Heesung Chong, Junsung Park, Hanlim Lee, Gitaek T. Lee, Eunjo S. Ha, Rokjin J. Park, Hyeong-Ahn Kwon, and Jhoon Kim

Atmos. Meas. Tech., 17, 5147–5159, https://doi.org/10.5194/amt-17-5147-2024, 2024

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-393', Anonymous Referee #1, 10 Apr 2024

Review of “A bias-corrected GEMS geostationary satellite product for nitrogen dioxide using machine learning to enforce consistency with the TROPOMI satellite instrument” (EGUsphere-2024-393) by Oak et al.
Recommendation: Minor Revision
Summary Statement: This article demonstrates that a machine learning model can help reduce the bias of the GEMS geostationary satellite product for nitrogen dioxide relative to the TROPOMI product. The manuscript is well written and presents a clear and concise approach to obtaining a bias-corrected GEMS product.
One concern is about the paragraph discussing the SHAP results (lines 195-200). While the contribution of different input variables to the model's performance is an important aspect, I would recommend delaying this discussion until after the performance of the ML model itself has been addressed. This would allow the reader to understand the overall effectiveness of the model before delving into the details of the model and specific input variables.
Minor comment:
Line 248: It’s hard to understand the meaning of “ML correction increases the ocean background”. Do you mean that ML correction increases product population over the ocean?

Citation: https://doi.org/10.5194/egusphere-2024-393-RC1
- AC1: 'Reply on RC1', Yujin J. Oak, 16 May 2024
  
  1. One concern is about the paragraph discussing the SHAP results (lines 195-200). While the contribution of different input variables to the model's performance is an important aspect, I would recommend delaying this discussion until after the performance of the ML model itself has been addressed. This would allow the reader to understand the overall effectiveness of the model before delving into the details of the model and specific input variables.
  We moved the paragraph to lines 220-229, after evaluating the model’s performance, and switched the order of figures accordingly.
  
  2. Line 248: It’s hard to understand the meaning of “ML correction increases the ocean background”. Do you mean that ML correction increases product population over the ocean?
  We clarified the sentence to (lines 207-209):
  “…GEMS product increases VCDs in the remote ocean background in the southeastern part of the GEMS scan domain by up to 200% and decreases VCDs in Central Asia by up to 40%, regardless of season.”
  And also in lines 255-256:
  “ML correction increases VCDs in the remote ocean regions by up to 200% and decreases VCDs in Central Asia by up to 40%.”
  
  Citation: https://doi.org/10.5194/egusphere-2024-393-AC1
RC2:
'Comment on egusphere-2024-393', Anonymous Referee #2, 13 May 2024

Oak et al. present an interesting study in which they compare GEMS and TROPOMI NO₂ total vertical columns. They find that by recalculating the AMF using consistent GEOS-Chem profiles for GEMS and TROPOMI, the differences between GEMS and TROPOMI columns are greatly reduced. Furthermore, the comparison with Pandora data is also improved by this step, both for TROPOMI and GEMS. Finally, they use an ML model to further improve the agreement between GEMS and TROPOMI columns. Their work shows how TROPOMI data can be correctly used as a transfer standard between the different geostationary instruments.
The results are clearly and honestly presented. It is appreciated that the comments addressed during the quick report have been included. I recommend publication after minor revisions. I would like to read more details/discussion about the points below.
1/ It is an interesting result to show that the main reason for the differences between GEMS and TROPOMI NO₂ VCD lies in the AMF calculation. The relatively good agreement between the reprocessed columns shows that the NO₂ SCD retrievals are consistent. Concerning the GEMS NO₂ AMF, since the AKs are taken from the GEMS CHOCHO L2 product, we cannot exclude an issue other than incorrect use of the GEOS-Chem vertical coordinates.
P6, line 166: “much of the discrepancy in the L2 products stem from different vertical shape factors”. Please remind the reader that a large part could also come from an incorrect use of the vertical coordinates in the GEMS NO₂ operational product.
2/ It is not shown that the ML model improves the comparison of the diurnal variation with Pandora (mainly from Figure 3). There is no evidence that including the TROPOMI VZA up to 50° actually helps to “build an ML model relevant to GEMS observations at different times of day”, as stated on p. 6, line 185, in the abstract, and in the conclusions. Please comment on the possibility of further improving the GEMS diurnal variation using ML techniques.
Related to this point, it is not clear why the diurnal variation is more affected by the ML model during warm months than during cold months. I expect larger angles during cold months, and therefore a larger correction. Maybe it is because the days are longer during warm months?
3/ Figure 2: The GEMS NO₂ columns seem to be cut off at negative values. Is this an effect of the GEMS quality flags? This cutting effect seems to be amplified by the reprocessing and ML correction steps. The correlation is degraded from step 1 to 2. Have you tried to apply an improved quality filtering for GEMS? Or would it make sense to filter negative TROPOMI columns as well?
4/ Figure 3a: The corrected GEMS columns do agree better with Pandora than the reprocessed GEMS columns. However, it is not obvious that they agree better with the reprocessed TROPOMI columns (it is the case for Jan/Feb, but not in May or June). It looks like the ML model tends to decrease the GEMS columns but has difficulty increasing them even when there is a negative difference with TROPOMI. Can you comment on this?
5/ Legend of Figure 3: It should be mentioned explicitly that all the NO₂ columns are total VCDs, including the Pandora columns.

Citation: https://doi.org/10.5194/egusphere-2024-393-RC2
- AC2: 'Reply on RC2', Yujin J. Oak, 16 May 2024
  
  We have addressed these comments. Our responses appear in colored font in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2024-393-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Yujin J. Oak on behalf of the Authors (16 May 2024) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (20 May 2024) by MH AHN

RR by Anonymous Referee #3 (10 Jul 2024)

ED: Publish subject to technical corrections (18 Jul 2024) by MH AHN

AR by Yujin J. Oak on behalf of the Authors (18 Jul 2024) Author's response Manuscript

Viewed

Total article views: 612 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
436	144	32	612	19	29

Views and downloads (calculated since 04 Apr 2024)

Month	HTML	PDF	XML	Total
Apr 2024	178	59	10	247
May 2024	104	39	9	152
Jun 2024	64	13	6	83
Jul 2024	38	19	4	61
Aug 2024	46	12	3	61
Sep 2024	6	2	0	8

Viewed (geographical distribution)

Total article views: 608 (including HTML, PDF, and XML) Thereof 608 with geography defined and 0 with unknown origin.

Latest update: 09 Sep 2024

Short summary

We present an improved NO₂ product from GEMS by calibrating it to TROPOMI using machine learning and by reprocessing both satellite products to adopt common NO₂ profiles. Our corrected GEMS product combines the high data density of GEMS with the accuracy of TROPOMI, supporting the combined use for analyses of East Asia air quality including emissions and chemistry. This method can be extended to other species and geostationary satellites including TEMPO and Sentinel-4.
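The calibration idea in this summary, learning the GEMS minus TROPOMI difference from GEMS-side predictors and then subtracting the predicted difference, can be sketched as follows. This is a minimal illustration and not the authors' model: a linear least-squares fit stands in for the ML model, only two predictors (GEMS NO₂ VCD and effective zenith angle) are used, and all function and variable names are hypothetical.

```python
import numpy as np

def _design_matrix(gems_vcd, eza):
    """Stack an intercept, the GEMS VCD, and the EZA as predictor columns."""
    gems_vcd = np.asarray(gems_vcd, dtype=float)
    eza = np.asarray(eza, dtype=float)
    return np.column_stack([np.ones_like(gems_vcd), gems_vcd, eza])

def fit_delta_model(gems_vcd, eza, tropomi_vcd):
    """Fit delta = GEMS - TROPOMI on collocated pixels (linear stand-in for ML)."""
    delta = np.asarray(gems_vcd, dtype=float) - np.asarray(tropomi_vcd, dtype=float)
    coef, *_ = np.linalg.lstsq(_design_matrix(gems_vcd, eza), delta, rcond=None)
    return coef

def correct_gems(gems_vcd, eza, coef):
    """Subtract the predicted delta from every GEMS pixel, collocated or not."""
    predicted_delta = _design_matrix(gems_vcd, eza) @ coef
    return np.asarray(gems_vcd, dtype=float) - predicted_delta
```

The model is trained only where both instruments observe the same scene, but because its inputs come from GEMS alone it can then be applied to every hourly GEMS pixel. That is how a correction of this kind keeps the GEMS data density while tying the values to TROPOMI.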

Cited

1 citation as recorded by Crossref.

