the Creative Commons Attribution 4.0 License.
Development and validation of satellite-derived surface NO2 estimates using machine learning versus traditional approaches in North America
Abstract. Nitrogen dioxide (NO2) is a key pollutant with profound implications for air quality and human health, and its concentration is needed to establish the Air Quality Health Index (AQHI). Currently, over 600 surface air monitoring stations measure NO2 across Canada and the United States, but many areas remain unmonitored, leading to incomplete information for health risk assessments. This study leverages Tropospheric Monitoring Instrument (TROPOMI) satellite observations and machine learning models to derive high-resolution surface NO2 concentrations, providing enhanced spatial coverage and accuracy and revealing urban-rural NO2 gradients across North America. Traditional methods rely on scaling with modeled profiles to obtain surface NO2 concentrations from satellite observations. Here, we compare this traditional method to a machine learning approach that uses NO2 observations from TROPOMI together with meteorological parameters, land cover type, topography, and emission inventories. Our results show that the machine learning approach (a random forest) yields significantly less bias between the surface monitoring measurements and the "satellite-derived" surface concentrations and significantly improves the correlation coefficient (R2~0.77–0.91) compared to the traditional method (R2~0.39–0.57).
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-1681', Anonymous Referee #1, 09 May 2025
- RC2: 'Comment on egusphere-2025-1681', Fei Liu, 29 Jul 2025
The authors develop a machine learning algorithm to infer surface NO2. The manuscript is well-written, and the results appear robust. I recommend the paper for publication after minor revisions.
General comments:
- The authors state, “Currently, to our knowledge, there is little machine learning done to derive surface concentrations in less populated areas such as Canada.” While this points out the motivation of the work, it would be helpful to expand on why this is significant. What differences between more populated regions (e.g., China, Germany) and less populated regions (e.g., Canada) would justify the need for this study? This additional context could better motivate the work.
- Section 2.4.2. I found it surprising that, in addition to NOₓ emissions, SO₂ and NH₃ emissions were included as predictors. Have you tested the sensitivity of your model to these additional species? It would be worthwhile to elaborate on why these variables are expected to play a role in predicting surface NO₂. In Figure 3, I observe that NO₂ emissions are far more significant compared to other parameters, including TROPOMI NO₂. This observation seems inconsistent with the conclusion that “This is a feature that we explicitly wanted to see in the random forest prediction, as this means the random forest function is primarily driven by the satellite measurements of NO₂.” Clarifying this potential discrepancy would strengthen the argument.
- The authors mention that NO2 surface concentrations from the GEM-MACH model were used to augment the training dataset in remote areas. Would assigning greater weight to rural monitoring stations achieve a similar effect? If this has not been tested, it would be worthwhile to evaluate whether such an approach could improve the model's predictions for less populated regions.
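The reweighting idea raised in the last general comment can be illustrated with a minimal sketch. This is not the authors' method or a tested recommendation; the station labels and the helper name `balanced_station_weights` are hypothetical. It assigns each training sample a weight inversely proportional to the size of its station class, so that rural and urban samples contribute equally in total. Weights of this kind could then be supplied through the `sample_weight` argument that scikit-learn's `RandomForestRegressor.fit` accepts.

```python
from collections import Counter


def balanced_station_weights(station_types):
    """Return one weight per sample so that every station class
    (e.g. "urban", "rural") has the same total weight."""
    counts = Counter(station_types)
    n_classes = len(counts)
    n_samples = len(station_types)
    # Inverse-frequency weighting: rarer classes get larger weights.
    return [n_samples / (n_classes * counts[t]) for t in station_types]


# Hypothetical training set: 6 urban samples and 2 rural samples.
station_types = ["urban"] * 6 + ["rural"] * 2
weights = balanced_station_weights(station_types)

rural_total = sum(w for w, t in zip(weights, station_types) if t == "rural")
urban_total = sum(w for w, t in zip(weights, station_types) if t == "urban")
print(rural_total, urban_total)
```

Comparing a model trained with such weights against the GEM-MACH-augmented model at held-out rural stations would show whether reweighting alone achieves a similar effect.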
Specific comments:
- Line 170: the exponent in 1e14 should be typeset as a superscript.
Citation: https://doi.org/10.5194/egusphere-2025-1681-RC2
Please see the attached document for the comments.