A close look at using national ground stations for the statistical modeling of NO₂

Boersma, Foeke; Lu, Meng

doi:10.5194/egusphere-2023-1260

Preprints

https://doi.org/10.5194/egusphere-2023-1260

23 Oct 2023

Abstract. Air pollution causes a wide range of health and societal problems, so it is essential to model and predict air pollution over space. A growing number of statistical air pollution models have been developed using geospatial variables associated with emission and dispersion processes. However, more models do not necessarily bring higher prediction accuracy or lower uncertainty. One aspect that is often disregarded is spatial heterogeneity. In this study, we evaluate and compare various spatial and non-spatial statistical and machine learning methods, with attention to different spatial groups, which are identified by the predictor variables. We found that prediction accuracy differs substantially between spatial groups. Predictions in places close to roads with high populations are poor, while accuracy increases in low population density areas for both local and global models. For global models, accuracy increases further in places that are far from roads. This division into spatial groups also shows that global non-linear methods can achieve higher prediction accuracy than global linear methods. The spatial prediction patterns show that non-linear methods generally predict more smoothly than linear methods. Additionally, clusters of predicted air pollution differ within and between cities. Lastly, applying the same methods to the local dataset yields poor metrics, especially for the non-linear methods.
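The group-wise evaluation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the group labels and the station values are all hypothetical, and RMSE is used as one possible accuracy metric.

```python
import math
from collections import defaultdict

def rmse_by_group(observed, predicted, groups):
    """Return {group: RMSE} for paired observations and predictions."""
    squared_errors = defaultdict(list)
    for obs, pred, grp in zip(observed, predicted, groups):
        squared_errors[grp].append((obs - pred) ** 2)
    return {g: math.sqrt(sum(e) / len(e)) for g, e in squared_errors.items()}

# Hypothetical NO2 values (µg/m³) and group labels, for illustration only.
observed = [40.0, 35.0, 22.0, 18.0, 12.0, 10.0]
predicted = [30.0, 41.0, 20.0, 19.0, 12.5, 9.5]
groups = [
    "near_road_high_pop", "near_road_high_pop",
    "near_road_low_pop", "near_road_low_pop",
    "far_from_road", "far_from_road",
]

scores = rmse_by_group(observed, predicted, groups)
for group, score in scores.items():
    print(f"{group}: RMSE = {score:.2f}")
```

Reporting one score per group, rather than a single pooled score, is what makes differences between locations visible.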

Received: 18 Jun 2023 – Discussion started: 23 Oct 2023

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 2684 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Supplement (6298 KB)

Journal article(s) based on this preprint

02 Oct 2025

A close look at using national ground stations for the statistical modeling of NO₂

Foeke Boersma and Meng Lu

Geosci. Model Dev., 18, 6717–6735, https://doi.org/10.5194/gmd-18-6717-2025, 2025

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2023-1260', Anonymous Referee #1, 09 Nov 2023
General comments:
Currently, many studies are using spatially sparse fixed-site measurements to map air pollution on a large scale, ignoring the local spatial heterogeneities such as the intra-city variations. This article evaluated the performance of various algorithms across different scales and validated the accuracy separately in subsets categorized by road density and population density. They found that the model performance varied significantly at different spatial locations. The pattern was found to be different in “global” and local models. The comparison between “global” and local models in terms of intra-city distribution patterns is valuable. However, in its present form, I cannot recommend the article for publication. With substantial revision and restructuring, this article could be a useful addition to the existing literature.
The writing needs further improvement, and the current version is not easy to read. First, the manuscript is too long. I appreciate the solid work of the authors, but please simplify the main text and consider moving some descriptions and figures to the Appendix. Keep only the core story in the main text and make sure the primary findings and the most important messages stand out. Second, consider restructuring the method/data and the result sections. Third, the captions of figures and tables need more detail, including the unit of NO₂. Fourth, clarify definitions such as “Far from road” vs “Rural”. Last, please pay attention to tense usage.
Specific comments:
The mobile measurements from Kerckhoffs et al., 2019 were taken on the road. How can they validate the accuracy for the “far from road” group? Did you perform any adjustments?

Table 1 describes the predictor features. Why not include land use proportions? Land Use Regression models are efficient and well-accepted methods in air pollution modeling.

Figure 4. It would be better to plot a map of the differences between the model tested and the benchmark (i.e., NO₂ estimations from Kerckhoffs et al., 2019). I would be curious about the difference in spatial distributions between the “global” and local models.

A restructuring of the data/method section is recommended. Begin with the introduction of the data source, ensuring clarity on the source of the population information and road class when discussing spatial groups. Consider adding a table summarizing model input/algorithms for ease of understanding. Move some algorithm introductions to the appendix.

Please explain why 20-fold cross-validation was used.

Technical corrections:
I have listed some specific points below, but the corrections are not limited to them.
Abstract:
The abstract attempts to encompass numerous findings but allocates insufficient space to elucidate the methodology and experimental setting. A substantial rephrasing of the abstract is needed.
Lines 1-5: the toing and froing can be simplified.
Line 6: please provide more details about the meaning of “spatial heterogeneity” in this context.
Lines 9-10: what are the local and global models? Define them before using them.
Methodology:
Lines 100-105: not clear. How do you divide the area, and for what purpose? What is the time frame of these national measurements? What is the measurement frequency? Was there any preprocessing? More details are needed here. How do you define the less densely populated area? What is the source of the population density data?
Line 121: “rural” = “Far from roads”? Please keep the terminology consistent.
Line 123: the labels of the models should be provided as a legend in the figure instead of in the caption.
Lines 130-135: the unit of NO₂ is missing. This paragraph is not informative. The values can be integrated into Figure 1.
Line 145: more details about kriging and accuracy are needed.
Lines 160-165: is the traffic volume used as the annual average? Table 1: it would help readers to understand the data distribution if columns were added with counts and statistics such as the mean, median and quantiles.
Line 168: the section title should begin with a capital letter, and the formatting needs further refinement.
Line 190: not clear. Please do not refer to the citation but to the dataset you have described in Section 2.1.
Line 195: please rephrase instead of using a direct quote.
Line 196: details of the tuning strategy are missing.

Results and discussion:
Line 465: how do you compare the influence of predictors between cities? The feature importance is a relative value. Its magnitude is not meaningful when compared with other models.
Line 515: this is opposite to common knowledge (see Hoek et al., 2008). Can you explain why non-linear model predictions were smoother?
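The point about line 465 can be illustrated with a small sketch. It assumes, as is common for tree-based models, that importances are scaled to sum to 1 within each model; the predictor names and raw scores below are hypothetical and are not taken from the manuscript.

```python
def relative_importance(raw_scores):
    """Scale raw importance scores so they sum to 1 within one model."""
    total = sum(raw_scores.values())
    return {name: value / total for name, value in raw_scores.items()}

# Hypothetical raw error reductions attributed to each predictor
# in two separately fitted city models.
city_a_raw = {"road_length": 8.0, "population": 2.0}
city_b_raw = {"road_length": 8.0, "population": 32.0}

city_a = relative_importance(city_a_raw)
city_b = relative_importance(city_b_raw)

# The raw contribution of road_length is identical in both models,
# yet its relative importance differs because the other predictor changed.
print("city A:", city_a)
print("city B:", city_b)
```

Because each model is normalized on its own, a larger share in one city does not by itself mean the predictor has a larger absolute effect there.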
Citation: https://doi.org/10.5194/egusphere-2023-1260-RC1
RC2: 'Comment on egusphere-2023-1260', Anonymous Referee #2, 14 Jan 2024

Dear authors,
please, find attached an extended report about your paper.
Best regards

Citation: https://doi.org/10.5194/egusphere-2023-1260-RC2
AC1: 'Comment on egusphere-2023-1260', Foeke Boersma, 25 Mar 2024

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-1260/egusphere-2023-1260-AC1-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2023-1260-AC1

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Foeke Boersma on behalf of the Authors (26 Mar 2024) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (06 Apr 2024) by Klaus Klingmüller

RR by Anonymous Referee #1 (08 May 2024)

RR by Anonymous Referee #3 (27 May 2024)

ED: Reconsider after major revisions (12 Jun 2024) by Klaus Klingmüller

AR by Foeke Boersma on behalf of the Authors (12 Sep 2024) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (11 Oct 2024) by Klaus Klingmüller

RR by Anonymous Referee #4 (25 Oct 2024)

RR by Anonymous Referee #1 (27 Oct 2024)

ED: Reconsider after major revisions (14 Nov 2024) by Klaus Klingmüller

AR by Foeke Boersma on behalf of the Authors (24 Dec 2024) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (26 Jan 2025) by Klaus Klingmüller

RR by Anonymous Referee #4 (05 Feb 2025)

Suggestions for revision or reasons for rejection

I would like to thank the authors for the revised manuscript and for their detailed replies to the comments they received. The revised manuscript addresses most of my previous concerns. There are however a couple of points that I believe were not addressed adequately before publication:

1. While it’s true that gradient boosting algorithms optimize residuals iteratively, having too many estimators can lead to overfitting even after apparent convergence. The argument that "predictions stabilize" ignores that small changes in later trees can still accumulate to overfit the training data; this is the reason XGBoost includes regularization terms, as an unlimited (or very large) number of boosting rounds can lead to overfitting. The authors should justify the inclusion of such a high number of estimators and provide evidence (i.e., validation curves). The reasoning that more trees than necessary do not affect the model is not valid, as later trees can still make small adjustments that collectively overfit. The authors could provide some empirical justification for this choice, such as a curve of model performance for models trained with an increasing number of estimators up to 50,000. This would demonstrate whether 50,000 estimators were indeed necessary for convergence. The cited paper (Vezhnevets & Barinova, 2007) does not directly address best practices for modern gradient boosting implementations.

2. The 20-fold cross-validation methodology does not adhere to the standard k-fold cross-validation methodology, in which the data are divided into k non-overlapping folds and each data point appears exactly once in a test set. The methodology used in the manuscript is closer to Monte Carlo cross-validation than to k-fold cross-validation. The random split employed can lead to biased performance estimates, especially with small datasets (some points may be used multiple times while others might not be used at all). I recommend using fewer folds (5-fold cross-validation is generally a good balance between bias and variance) and making sure each point is used once.

3. The manuscript would benefit from clearer methodology descriptions, particularly regarding the mixed-effects and kriging models. The authors should also provide more detailed information about data resolution and how predictions were generated at 100m resolution. The limitation of having the most heterogeneous group (urban) being the least represented in terms of data points should be more thoroughly discussed.
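The validation curve requested in point 1 can be sketched as follows. This is not the authors' XGBoost setup: it is a toy gradient boosting of one-split stumps on synthetic one-dimensional data, written with the standard library only, and every name and parameter value (`boost_with_curves`, 300 rounds, learning rate 0.3) is an assumption made for illustration.

```python
import math
import random

def fit_stump(x, residuals):
    """Best single-split regression stump on a 1-D feature (squared error)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    total = sum(residuals)
    left_sum, best = 0.0, None
    for k in range(1, n):
        left_sum += residuals[order[k - 1]]
        if x[order[k]] == x[order[k - 1]]:
            continue  # cannot split between identical feature values
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        # Maximising this gain is equivalent to minimising squared error.
        gain = k * left_mean ** 2 + (n - k) * right_mean ** 2
        if best is None or gain > best[0]:
            best = (gain, (x[order[k - 1]] + x[order[k]]) / 2, left_mean, right_mean)
    if best is None:  # all feature values equal: fall back to a constant
        return x[0], total / n, total / n
    return best[1], best[2], best[3]

def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean

def mse(y, pred):
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)

def boost_with_curves(x_tr, y_tr, x_va, y_va, n_rounds=300, learning_rate=0.3):
    """Boost stumps on the training set; record train and validation MSE per round."""
    base = sum(y_tr) / len(y_tr)
    pred_tr, pred_va = [base] * len(y_tr), [base] * len(y_va)
    train_curve, val_curve = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(y_tr, pred_tr)]
        stump = fit_stump(x_tr, residuals)
        pred_tr = [p + learning_rate * predict_stump(stump, xi) for p, xi in zip(pred_tr, x_tr)]
        pred_va = [p + learning_rate * predict_stump(stump, xi) for p, xi in zip(pred_va, x_va)]
        train_curve.append(mse(y_tr, pred_tr))
        val_curve.append(mse(y_va, pred_va))
    return train_curve, val_curve

rng = random.Random(0)

def make_data(n):
    """Synthetic noisy sine data, for illustration only."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return xs, [math.sin(xi) + rng.gauss(0, 0.5) for xi in xs]

x_tr, y_tr = make_data(60)
x_va, y_va = make_data(200)
train_curve, val_curve = boost_with_curves(x_tr, y_tr, x_va, y_va)
best_round = min(range(len(val_curve)), key=val_curve.__getitem__) + 1
print(f"best round by validation MSE: {best_round} of {len(val_curve)}")
print(f"validation MSE at best round {val_curve[best_round - 1]:.3f}, at last round {val_curve[-1]:.3f}")
print(f"training MSE at last round {train_curve[-1]:.3f}")
```

The training error can only fall as rounds are added, so it cannot show when to stop. The validation curve is what indicates whether later rounds still help or have started to fit noise.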

Despite these concerns, the study makes valuable contributions to understanding spatial heterogeneity in NO2 modeling and presents interesting comparisons between global and local modelling approaches. Once these methodological issues are addressed, the paper should provide useful insights for the air quality modelling community.
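The distinction drawn in point 2 between k-fold and Monte Carlo cross-validation can be made concrete by counting how often each point lands in a test set. The sketch below uses only the standard library. The dataset size, the 5% test fraction and the function names are assumptions for illustration and are not taken from the manuscript.

```python
import random
from collections import Counter

def k_fold_test_sets(n, k, rng):
    """Standard k-fold: shuffle once, then cut into k disjoint test folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def monte_carlo_test_sets(n, n_splits, test_fraction, rng):
    """Monte Carlo CV: draw an independent random test set for every split."""
    test_size = max(1, round(n * test_fraction))
    return [rng.sample(range(n), test_size) for _ in range(n_splits)]

def count_test_appearances(test_sets, n):
    """How many times each of the n points appears in a test set."""
    counts = Counter(i for fold in test_sets for i in fold)
    return [counts.get(i, 0) for i in range(n)]

rng = random.Random(42)
n_points = 100  # hypothetical number of stations

kfold_counts = count_test_appearances(k_fold_test_sets(n_points, 5, rng), n_points)
mc_counts = count_test_appearances(
    monte_carlo_test_sets(n_points, 20, 0.05, rng), n_points
)

print("k-fold: min/max times tested =", min(kfold_counts), max(kfold_counts))
print("Monte Carlo: never tested =", mc_counts.count(0))
print("Monte Carlo: tested more than once =", sum(c > 1 for c in mc_counts))
```

With k-fold every point is tested exactly once by construction. With repeated random splits the coverage depends on chance, which is the source of the bias the referee describes for small datasets.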

ED: Publish subject to minor revisions (review by editor) (06 Feb 2025) by Klaus Klingmüller

AR by Foeke Boersma on behalf of the Authors (04 Apr 2025) Author's response Author's tracked changes Manuscript

ED: Publish subject to minor revisions (review by editor) (09 Apr 2025) by Klaus Klingmüller

AR by Foeke Boersma on behalf of the Authors (11 Apr 2025) Author's response Author's tracked changes Manuscript

ED: Publish as is (15 Apr 2025) by Klaus Klingmüller

AR by Foeke Boersma on behalf of the Authors (23 Apr 2025) Author's response Manuscript

Supplement

https://doi.org/10.5194/egusphere-2023-1260-supplement

Viewed

Total article views: 2,416 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
1,747	585	84	2,416	161	124	185

Views and downloads (calculated since 23 Oct 2023)

Month	HTML	PDF	XML	Total
Oct 2023	75	24	5	104
Nov 2023	24	2	1	27
Dec 2023	14	10	1	25
Jan 2024	28	10	2	40
Feb 2024	9	5	2	16
Mar 2024	21	11	3	35
Apr 2024	25	8	5	38
May 2024	10	8	0	18
Jun 2024	14	9	3	26
Jul 2024	48	16	6	70
Aug 2024	28	2	2	32
Sep 2024	38	6	0	44
Oct 2024	28	8	0	36
Nov 2024	44	2	2	48
Dec 2024	14	18	0	32
Jan 2025	24	12	0	36
Feb 2025	20	16	6	42
Mar 2025	16	10	2	28
Apr 2025	30	28	0	58
May 2025	26	20	2	48
Jun 2025	30	24	2	56
Jul 2025	26	20	2	48
Aug 2025	144	20	0	164
Sep 2025	744	32	10	786
Oct 2025	42	28	0	70
Nov 2025	34	46	6	86
Dec 2025	20	54	4	78
Jan 2026	42	36	4	82
Feb 2026	48	16	6	70
Mar 2026	44	32	6	82
Apr 2026	15	16	1	32
May 2026	20	34	1	55
Jun 2026	2	2	0	4

Viewed (geographical distribution)

Total article views: 2,399 (including HTML, PDF, and XML) Thereof 2,399 with geography defined and 0 with unknown origin.

Latest update: 25 Jun 2026

Short summary

Air pollution harms health and society. Understanding and predicting it is crucial. Various models are developed to model air pollution. However, the consistency exhibited by a model in different areas is commonly neglected. Our study accounts for this and shows lower accuracy near busy roads, but higher in less populated areas. Considering location characteristics in air pollution predictions is important in comparing statistical models and understanding the health-society-space relationship.
