the Creative Commons Attribution 4.0 License.
Development of an integrated model framework for multi-air-pollutant exposure assessments in high-density cities and the implications for epidemiological research
Abstract. Exposure models for some criteria air pollutants have been intensively developed in past research; multi-air-pollutant exposure models, especially for particulate chemical species, have however been overlooked in Asia. The lack of an integrated model framework to calculate multi-air-pollutant exposure hinders combined exposure assessment and the corresponding health assessment. This work applied the land-use regression (LUR) approach to develop an integrated model framework to estimate annual-average exposures to four major PM10 chemical species as well as four criteria air pollutants (PM10, PM2.5, NO2, and O3) in a typical high-rise and high-density Asian city (Hong Kong, China). Our integrated multi-air-pollutant exposure model framework explains 91–97 % of the variability of measured air pollutant concentrations, with leave-one-out cross-validation R2 values ranging from 0.73 to 0.93. Using the model framework, the spatial distributions of the concentrations of the various air pollutants were generated at a spatial resolution of 500 m. The LUR model-derived spatial distribution maps revealed weak to moderate spatial correlations between the PM10 chemical species and the criteria air pollutants, which may help to distinguish their independent chronic health effects. In addition, further improvements in the development of air pollution exposure models are discussed. This study proposes an integrated model framework for estimating multi-air-pollutant exposure in high-density and high-rise urban areas, serving as an important tool for combined exposure assessment and the corresponding epidemiological studies.
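To make the two performance figures in the abstract concrete, the following is a minimal sketch of how an in-sample R2 and a leave-one-out cross-validation (LOOCV) R2 can differ for a linear regression fitted to a small number of monitoring sites. It is not the authors' code: the data are synthetic, the three predictors are hypothetical, ordinary least squares stands in for the LUR model, and the LOOCV R2 is computed here as 1 minus the ratio of held-out squared error to total variance, which may differ from the definition used in the paper.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Apply fitted coefficients to a 2-D predictor array."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def r_squared(y, y_hat):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def loocv_r_squared(X, y):
    """Refit the model n times, each time predicting the one held-out site."""
    n = len(y)
    y_hat = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_ols(X[keep], y[keep])
        y_hat[i] = predict(beta, X[i:i + 1])[0]
    return r_squared(y, y_hat)

# Synthetic stand-in for annual means at 16 sites (all values are made up).
rng = np.random.default_rng(0)
n_sites = 16
X = rng.normal(size=(n_sites, 3))  # three hypothetical land-use predictors
y = 40.0 + X @ np.array([5.0, -3.0, 2.0]) + rng.normal(scale=2.0, size=n_sites)

r2_fit = r_squared(y, predict(fit_ols(X, y), X))
r2_loocv = loocv_r_squared(X, y)
print(f"in-sample R2 = {r2_fit:.3f}, LOOCV R2 = {r2_loocv:.3f}")
```

With this definition the LOOCV value cannot exceed the in-sample value for ordinary least squares, so the gap between the two is one simple indicator of how much the fit depends on individual sites.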
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2023-513', Anonymous Referee #1, 17 May 2023
The authors present results from a land use regression (LUR) framework used to create 500 m resolution exposure fields for multiple pollutants, including NO2, O3, total PM10 and PM2.5, and multiple PM10 species. Their LUR models have good predictive performance in leave-one-out cross-validation. I agree with the authors that the fine-scale exposure products may be useful for future exposure and epidemiological studies. I believe, however, that the study does not go far enough in explaining the implications of its results. In particular, I find it lacking in two areas relevant to the scope of ACP.
First, very little description is given of the meaning behind the variables that are selected for the LUR models, including why they are predictive of the various species and what they may be proxies for. Did any variables that have been found predictive not make sense? Are there co-linearities that may be obscuring the influence of some variables over others? Can we learn something from the predictors chosen that can inform policies to reduce exposure?
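One common way to screen for the co-linearities raised above is the variance inflation factor (VIF), which regresses each predictor on all the others. The sketch below is only an illustration under stated assumptions: the predictor names (`road_density`, `traffic_proxy`, `green_fraction`) and the data are hypothetical and are not taken from the manuscript, and nothing here indicates which diagnostic, if any, the authors applied.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing column j on the rest."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Hypothetical predictors at 16 sites; the second is built to track the first closely.
rng = np.random.default_rng(1)
road_density = rng.normal(size=16)
traffic_proxy = road_density + rng.normal(scale=0.1, size=16)
green_fraction = rng.normal(size=16)

X = np.column_stack([road_density, traffic_proxy, green_fraction])
vifs = variance_inflation_factors(X)
for name, vif in zip(["road_density", "traffic_proxy", "green_fraction"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A large VIF for two predictors that track each other means their individual coefficients are poorly determined, so a selected variable may be standing in for an unselected one.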
Second, there is little description provided of the exposure products themselves. What are the implications for human exposure? It would be useful to select important areas of the city (e.g., an area with high population density), describe which pollutants are predicted to have high concentrations, and offer some suggestions about why.
Specific comments
Abstract: it is important to describe the exposure product in full, including the temporal coverage (i.e., which year?)
Line 81-82: it strikes me as strange to have 3 citations for a sentence describing the geography of Hong Kong.
122-123: it would be helpful to explain more about why lat/lon would account appropriately for transboundary pollution
134-135: please explain “if the direction was as pre-defined”
135: Model selection process: please confirm whether the model selection R2 was calculated on the training dataset or the hold-out dataset. If on the training dataset, is there a risk of over-fitting?
135: It seems to me that more flexible machine learning methods may be more adept at capturing nonlinear relationships between environmental predictors and measured pollutants. Is it expected that these variables have a linear relationship with pollutant species? The final paragraph of the manuscript mentions that these are more appropriate with more data, but there is no evidence given or description of how much data is needed.
167: “either comparable to or higher than” makes it sound like a dichotomous variable. I think “generally greater than” would describe this well enough.
176-178: I do not understand how the selection of the predictor variables was related to these other factors. Please clarify.
192: This line refers to a negative z score close to 1. Does this suggest anti-spatial correlation for Cd? More description of this variable is needed
Could the authors provide spatial maps of error at monitor locations? In general, it would be useful to develop estimates of uncertainty on the same spatial scale as the predictions.
220: is this implying that population density leads to high PM? Can the authors suggest a mechanism here that isn’t explained by the other model covariates?
239-241: I am confused about the difference between the GAS and PM modules. Is it necessary to differentiate beyond the pollutant species names?
257-260: I am not sure what this adds to the discussion. Can the authors be more specific here based on the predictors chosen for the final model?
How was 500 m selected as the best resolution for the predictions?
Citation: https://doi.org/10.5194/egusphere-2023-513-RC1
AC1: 'Reply on RC1', Steve Hung Lam Yim, 21 Aug 2023
Dear referee,
Thank you for your comments and suggestions. We carefully addressed them one by one as shown below. Hope you find our revisions useful. Thank you again.
Regards,
YIM, Hung-Lam Steve, Ph.D.
Associate Professor, Asian School of the Environment
Associate Professor, Lee Kong Chian School of Medicine
Principal Investigator, Earth Observatory of Singapore
Nanyang Technological University (NTU), Singapore
Email: steve.yim@ntu.edu.sg
ASE@NTU: https://www.ntu.edu.sg/ase/aboutus/staff-directory/staff-details/yim-hung-lam-steve
Address: Block N2-01C-44, Asian School of the Environment, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798
-
RC2: 'Comment on egusphere-2023-513', Anonymous Referee #2, 05 Jul 2023
The author employs a land-use regression (LUR) approach in developing an integrated model framework to estimate the annual average exposures of four primary PM10 chemical species and four criteria air pollutants (PM10, PM2.5, NO2, and O3) within the context of a typical high-rise and high-density Asian city, namely Hong Kong, China. The methodology leverages annual concentrations of these air pollutants as captured by monitoring stations in Hong Kong. However, it appears there are only sixteen data points available for analysis, and in certain cases even fewer for some pollutants.
Here are some significant queries I would like the author to address:
The choice of 2017 as the year of interest is based on the completeness of data. Nevertheless, how is this year relevant to the epidemiological data that the author plans to investigate further? The applicability of the annual model to current epidemiological studies seems somewhat limited. Given the study's title, could the author shed more light on this concern?
The author refers to the work as an "integrated model framework". However, I can only discern individual LUR models designated for each pollutant. Could the author elaborate on how the integration of this model framework was accomplished?
The models were specifically developed for "high-density cities". Still, it appears that there is a mismatch between the sparse density of monitoring stations and the actual population density. Could the author provide further clarification on this discrepancy?
Overall, it appears that the number of data points presents a significant limitation in this study. Would the author consider adopting a spatio-temporal model to incorporate a larger data set with finer temporal resolution?
Due to these major issues, I do not want to go into the details of this article.
Citation: https://doi.org/10.5194/egusphere-2023-513-RC2
AC2: 'Reply on RC2', Steve Hung Lam Yim, 21 Aug 2023
Dear referee,
Thank you for your comments and suggestions. We carefully addressed them one by one as shown below. Hope you find our revisions useful. Thank you again.
Regards,
YIM, Hung-Lam Steve, Ph.D.
Associate Professor, Asian School of the Environment
Associate Professor, Lee Kong Chian School of Medicine
Principal Investigator, Earth Observatory of Singapore
Nanyang Technological University (NTU), Singapore
Email: steve.yim@ntu.edu.sg
ASE@NTU: https://www.ntu.edu.sg/ase/aboutus/staff-directory/staff-details/yim-hung-lam-steve
Address: Block N2-01C-44, Asian School of the Environment, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
461 | 171 | 26 | 658 | 62 | 11 | 11 |
Zhiyuan Li
Kin-Fai Ho
Harry Fung Lee