Post-process correction improves the accuracy of satellite PM2.5 retrievals
Abstract. Estimates of PM2.5 levels are crucial for monitoring air quality and studying the epidemiological impact of air quality on the population. Currently, the most precise measurements of PM2.5 are obtained from ground stations, resulting in limited spatial coverage. In this study, we consider satellite-based PM2.5 retrieval, which involves converting high-resolution satellite retrievals of Aerosol Optical Depth (AOD) into high-resolution PM2.5 estimates. To improve the accuracy of the AOD-to-PM2.5 conversion, we employ a machine-learning-based post-process correction to correct the AOD-to-PM conversion ratio derived from Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) reanalysis data. The post-process correction approach fuses and downscales satellite observation and retrieval data, MERRA-2 reanalysis data, various high-resolution geographical indicators, meteorological data, and ground station observations to learn a predictor of the approximation error in the AOD-to-PM2.5 conversion ratio. The corrected conversion ratio is then applied to estimate PM2.5 levels from high-resolution satellite AOD retrievals derived from Sentinel-3 observations. Our model produces PM2.5 estimates with a spatial resolution of 100 meters at satellite overpass times. Additionally, we have incorporated an ensemble of neural networks to provide error envelopes for the machine-learning-related uncertainty in the PM2.5 estimates.
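To make the workflow described in the abstract concrete, below is a minimal sketch of the ratio-correction step: a learned predictor estimates the error in the MERRA-2 AOD-to-PM2.5 conversion ratio, and the corrected ratio is applied to the satellite AOD. The names (error_model, features) are hypothetical and the additive form of the correction is an assumption made purely for illustration; the paper defines the actual inputs and model.

```python
# Minimal sketch of the ratio-correction idea from the abstract. All names are
# hypothetical and the additive correction form is an assumption for illustration.
import numpy as np

def corrected_pm25(aod_sat, ratio_merra2, features, error_model):
    """Estimate PM2.5 from satellite AOD with a post-process-corrected ratio.

    aod_sat      : high-resolution satellite AOD (e.g. Sentinel-3), shape (N,)
    ratio_merra2 : MERRA-2-derived PM2.5/AOD conversion ratio, shape (N,)
    features     : per-pixel predictors (reanalysis, geography, meteorology, ...)
    error_model  : trained predictor of the approximation error in the ratio
    """
    correction = error_model.predict(features)     # learned error in the ratio
    ratio_corrected = ratio_merra2 + correction    # additive form, illustration only
    return ratio_corrected * aod_sat               # PM2.5 = corrected ratio * AOD
```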
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-2635', Anonymous Referee #1, 31 Jan 2024
Review of “Post-process correction improves the accuracy of satellite PM2.5 retrievals”
General comments:
This study designs a new approach to improve the accuracy of PM2.5 concentrations derived from satellite AOD retrievals through machine learning. Unlike most studies that build a direct relationship between PM2.5 and AOD, this research corrects the ratio between PM2.5 and AOD derived from MERRA-2 and applies the improved ratio to satellite AOD to estimate surface PM2.5 concentrations. Although this is a new approach, its advantage is not clear. Below are my specific comments.
Specific comments:
- There are many studies that focus on using AOD to estimate PM2.5 through machine learning approaches. Compared with them, what is the innovation of this study? I understand that this study corrects the ratio between PM2.5 and AOD derived from MERRA-2 and applies the improved ratio to satellite AOD to estimate surface PM2.5 concentrations, which differs from other studies that estimate PM2.5 directly. Although this is a new approach, what is its advantage? Compared with previous studies, can the new approach provide better PM2.5 estimates?
- This study excludes PM2.5 concentrations larger than 80 μg/m3. This would reduce the importance of the study, as the research community is more interested in heavily polluted scenes. The authors excluded these cases because of imbalanced data (only a small set of samples with PM2.5 concentrations larger than 80 μg/m3). Can the problem be solved through bagging or other rebalancing approaches (see the sketch after this list)?
- More information on the satellite data is needed. What are the temporal resolution and swath of the sensor?
- This study only demonstrates the validation results in Fig. 4. It would be more interesting to show the fitting (training) results as well.
- In Fig. 4, why does the monthly mean show a larger bias than the instantaneous estimates?
- Latitude and longitude are missing in the top panels of Figs. 5 and 6.
- Lines 78-80: I cannot understand this passage. More details are needed in the method description.
- Why are CALIOP data used? These are monthly mean data, but PM2.5 and AOD have strong diurnal variation. Can the CALIOP data help to improve the PM2.5 estimation?
- The unit of RMSE is missing throughout the paper.
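Regarding the bagging/rebalancing question raised above, the following is a minimal, purely illustrative sketch of one such rebalancing strategy (random oversampling of the rare high-PM2.5 samples before training). It is not the authors' method, and all names and the 80 μg/m3 threshold placement are hypothetical.

```python
# Illustrative rebalancing of an imbalanced training set by random oversampling
# of the rare high-PM2.5 samples. Hypothetical example, not the authors' method.
import numpy as np

def oversample_high_pm(features, pm25, threshold=80.0, seed=0):
    """Duplicate samples with PM2.5 above `threshold` until both groups are balanced."""
    rng = np.random.default_rng(seed)
    high = np.where(pm25 > threshold)[0]
    low = np.where(pm25 <= threshold)[0]
    if len(high) == 0 or len(high) >= len(low):
        return features, pm25                      # nothing to rebalance
    extra = rng.choice(high, size=len(low) - len(high), replace=True)
    idx = np.concatenate([low, high, extra])
    rng.shuffle(idx)
    return features[idx], pm25[idx]
```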
Citation: https://doi.org/10.5194/egusphere-2023-2635-RC1
- AC3: 'Reply on RC1', Andrea Porcheddu, 15 Mar 2024
CC1: 'Group comment on egusphere-2023-2635', Adam Povey, 13 Feb 2024
Supriya Mantri, Laura Horton, Adam Povey
National Centre for Earth Observation, University of Leicester, Space Park, Leicester, LE4 5SP, UK
The paper describes the importance of high-resolution PM2.5 data for air quality monitoring and health studies. It suggests a new machine learning technique for correcting the AOD-to-PM2.5 ratio using inputs from Sentinel-3, MERRA-2 reanalysis, and high-resolution geographical indicators. That post-process-corrected AOD-to-PM2.5 ratio was used to estimate PM2.5, which was then compared to OpenAQ ground stations across Europe (specifically, Paris and Madrid) for 2019. The product was shown to work best for low PM2.5 values.
As researchers working at the interface between satellite aerosol observations and air quality, we found this an interesting and inspiring publication which we enjoyed discussing and would like to see published. We share some minor corrections and comments for the authors to consider:
- The introduction was clear with good motivations and references but, in line 25, the reference to the WHO air quality guideline does not specify the time frame. Is it an annual average?
- Is it possible to give any additional reasoning behind the use of hourly downscaling of daily PM averages? This strikes us as one of the more consequential choices in this methodology and neglects complexities such as the diurnal cycle. (We aren’t questioning the authors’ judgement, nor disagreeing; merely curious as we wouldn’t have thought of this.)
- In section 2.3 the authors mention they use MERRA-2 reanalysis variables as an input for the model and provide a lengthy list in the appendix. Have all of the available variables been used as inputs, or were some variables not included? It may be helpful for those designing similar algorithms if you also mention which variables were excluded and why.
- Could you provide a reference for the RH equation on line 93? Or is it at standard temperature/pressure?
- In the NASA Black Marble Night Light section (2.6.2), the authors could include a reference to the data set used, especially a DOI so we can distinguish which of the four available datasets was used.
- We found Sections 3.3 and 3.4 quite opaque. They could be improved by adding more specific details about the process, such as the equations used as an input to the correction model. A flow chart of the steps used in the approach would make it clearer for an audience less familiar with machine learning and neural networks.
- Those of us with little experience with neural networks did not understand what Figure 2 wished to convey and those of us used to neural networks felt Figure 2 adds little to the text; it could be removed.
- At line 183, it would be helpful to have an understanding of what is meant by “slightly better”?
- Figure 3 could be improved as it is hard to distinguish between lines, perhaps using a filled histogram of stacked bars.
- We are curious if the authors considered any methods to amplify the availability of high PM2.5 observations, such as data augmentation? Our understanding was that the balancing of training data is an important step in constructing a neural network to recognise rare events and we would value the authors’ opinion.
- From line 223, what do you mean by “the fully learned approach were less accurate than with the post correction approach”? What metric of accuracy was used and how significant was the difference? This would help guide our own efforts in neural network generation.
- In Figure 4, are the authors certain about the plotting of panel B? Compared to panel A there appears to be some duplication of points up the y-axis. For example, there are three points at the extreme right of (A) but over ten in (B). This effect is not exhibited in panels D-F.
- In Figure 5 (comparison of the uncorrected and corrected methods at the ground stations), do the authors have any understanding of why there is a large discrepancy between OpenAQ and both satellite estimates for most of the sites (i.e. the blue dots do not overlap the red line in 5/9 cases shown)? Is it because of some local source (e.g. roads or small industrial buildings) in close proximity to the stations that isn't present in Madrid?
- In Figures 5 and 6, could the dots representing sites and the arrows indicating them be made substantially larger and outlined with a colour not in the plot (such as green or blue)? Our older member failed to see them on his own.
- In Figure 7 bottom left (Hull Freetown), why is the uncertainty so large in April? What are the possible reasons for high PM2.5 in February?
- If practical, it would be interesting to add an appendix highlighting a few ensemble members for Fig 5 and/or 6. This would demonstrate whether the smoothness of the fields shown at the top right of those figures is due to the action of the neural network or due to the median filter over the ensemble (a brief sketch of this aggregation follows after this list).
- It would be interesting to hear if training the model over an area with higher PM2.5 levels, such as South Africa or India, and then testing over central Europe improves the model’s performance, particularly for higher PM2.5 values.
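On the ensemble question above (and the error envelopes mentioned in the abstract), the following is a minimal sketch of per-pixel aggregation over an ensemble of PM2.5 fields: the median as the point estimate and percentiles as a spread envelope. The array shape and percentile levels are assumptions for illustration, not the authors' exact procedure.

```python
# Illustrative per-pixel aggregation over an ensemble of PM2.5 fields: median as
# the point estimate, percentiles as an error envelope. Shapes are assumptions.
import numpy as np

def aggregate_ensemble(pm25_members, lo_pct=5, hi_pct=95):
    """pm25_members: array (n_members, ny, nx) of PM2.5 fields from ensemble networks."""
    pm25_median = np.median(pm25_members, axis=0)                    # point estimate
    lo, hi = np.percentile(pm25_members, [lo_pct, hi_pct], axis=0)   # spread envelope
    return pm25_median, lo, hi
```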
Some typographic corrections:
L3: we consider a satellite-based
L4: satellite retrievals of Aerosol
L5: we employ a machine learning
L33: retrievals and an AOD-to-PM
L45: PM2.5 at a spatial resolution
L54: estimation of the AOD-to-PM
L62: The Sentinel-3 POPCORN AOD product is based on the post-process corrected
L76: by station; 1-hour
L154: due to the huge dimensionality
L181: A linear activation
L190: was divided into training
L248: is obtained by post-correcting
L249: to obtain similar post-process [this is actually a meaningful change; as written you say 'products corrected in a similar manner' and our revision says 'similar products that have been corrected']
Citation: https://doi.org/10.5194/egusphere-2023-2635-CC1
- AC1: 'Reply on CC1', Andrea Porcheddu, 15 Mar 2024
RC2: 'Comment on egusphere-2023-2635', Anonymous Referee #2, 19 Feb 2024
In this paper, the authors aimed to enhance the accuracy of satellite-derived PM2.5 estimates through post-process correction using a neural network model. While some improvement in the results was observed, there are significant concerns regarding the data used, the validation method, and the significance of the findings compared to previous studies. These issues warrant further examination and may impact the overall validity and applicability of the study's conclusions.
Specific comments:
Abstract:
The study lacks major conclusions and quantitative descriptive results.
Introduction:
The introduction is very short and lacks a comprehensive review of numerous previous studies on converting AOD to PM2.5 using machine learning models.
Data:
The use of MERRA-2 for calculating PM2.5 is criticized for its inaccuracies and omission of certain species such as nitrate. It is suggested that the authors consider using GEOS-CF data, which provide PM2.5 simulations at a higher resolution of 0.25 degrees.
Section 2.6:
The spatial resolution of high-resolution indicators such as roads and nighttime lights needs clarification.
The excessive number of variables selected raises questions about their relevance and contribution to the network model. It would be beneficial to employ importance analysis methods to identify and eliminate redundant variables. This process will streamline the model and improve its efficiency and interpretability.
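As one concrete instance of the importance analysis suggested above, here is a minimal sketch using permutation importance from scikit-learn to rank predictor variables; the regressor and the synthetic data are placeholders, not the paper's model or inputs.

```python
# Illustrative variable-importance screening with permutation importance
# (scikit-learn). The regressor and data are placeholders, not the paper's model.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                                   # placeholder predictors
y = 2.0 * X[:, 0] + X[:, 3] + rng.normal(scale=0.1, size=500)    # synthetic target

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Rank variables; low-importance ones are candidates for removal.
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: importance {result.importances_mean[i]:.3f}")
```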
Figure 3: It is unclear how the training and validation stations are divided. Additionally, the proximity of stations may lead to correlation issues, affecting the independence of training and testing samples spatially.
Section 3.4: The rationale for choosing the neural network model over other more powerful machine learning and deep learning models is not provided. The advantages of this model should be discussed.
Figure 4: While the accuracy has improved, the correlation remains relatively low (only 0.63), compared to previous studies achieving higher accuracy with AI (R2 higher than 0.8). The significance of the study is questioned, and comparison with previous studies to assess improvement is recommended.
Citation: https://doi.org/10.5194/egusphere-2023-2635-RC2
- AC2: 'Reply on RC2', Andrea Porcheddu, 15 Mar 2024
Authors: Andrea Porcheddu, Ville Kolehmainen, Timo Lähivaara, and Antti Lipponen