the Creative Commons Attribution 4.0 License.
AERO-MAP: A data compilation and modelling approach to understand spatial variability in fine and coarse mode aerosol composition
Abstract. Aerosol particles are an important part of the Earth system, but their concentrations are spatially and temporally heterogeneous, as well as variable in size and composition. Particles can interact with incoming solar radiation and outgoing long wave radiation, change cloud properties, affect photochemistry, impact surface air quality, change the surface albedo of snow and ice, and modulate carbon dioxide uptake by the land and ocean. High particulate matter concentrations at the surface represent an important public health hazard. There are substantial datasets describing aerosol particles in the literature or in public health databases, but they have not been compiled for easy use by the climate and air quality modelling community. Here we present a new compilation of PM2.5 and PM10 aerosol observations, focusing on the spatial variability across different observational stations, including composition, and demonstrate a method for comparing the datasets to model output. Overall, most of the planet, or even of the land fraction, does not have sufficient observations of surface concentrations, and especially of particle composition, to understand the current distribution of particles. Most climate models exclude 10–30 % of the aerosol particles in both PM2.5 and PM10 size fractions across large swaths of the globe in their current configurations, with ammonium nitrate and agricultural dust aerosol being the most important omitted aerosol types.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-1617', Anonymous Referee #1, 17 Jul 2024
This work presents 1) a compilation of the available surface PM2.5 and PM10 observations and their composition around the globe, spatially gridded at 2.5 deg of longitude and ~2 deg of latitude and temporally averaged over whatever measurement duration is available, 2) a model evaluation of the simulated PM against the gridded climatological dataset, presented as a methodology for model-data comparison, and 3) an identification of the measurement and modeling gaps.
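To make the gridding step in this summary concrete, the following is a minimal sketch of how station climatological means could be averaged onto a 2.5 deg by 2 deg grid. It is not the authors' code: the function names, the cell-indexing convention, and the station values are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Cell sizes taken from the referee's summary (2.5 deg longitude, ~2 deg latitude).
DLON, DLAT = 2.5, 2.0


def grid_cell(lat, lon):
    """Return (row, col) indices of the grid cell containing a station.

    The indexing convention (rows from 90S, columns from 0E) is an assumption.
    """
    row = math.floor((lat + 90.0) / DLAT)
    col = math.floor((lon % 360.0) / DLON)
    return row, col


def grid_means(stations):
    """Average station means within each cell.

    `stations` is an iterable of (lat, lon, mean_concentration) tuples.
    Returns {cell: (cell_mean, number_of_stations)}.
    """
    cells = defaultdict(list)
    for lat, lon, value in stations:
        cells[grid_cell(lat, lon)].append(value)
    return {cell: (sum(v) / len(v), len(v)) for cell, v in cells.items()}


# Hypothetical stations (concentrations in ug/m3), not taken from the dataset.
stations = [(40.1, -75.2, 12.0), (40.9, -75.9, 8.0), (10.0, 20.0, 30.0)]
gridded = grid_means(stations)
```

In this sketch the first two stations fall in the same cell and are averaged together, which is the step that removes sub-grid and temporal detail from the gridded product.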
1. Data compilation:
Although I commend the team, which has obviously put a large amount of effort into collecting extensive observational datasets (~15,000 stations and over 20 million observations, per section 3.1) and putting them into a common format, such a compilation in its present format (coarse spatial gridded resolution, no time resolution) is of little use for any type of air quality study and of rather limited help (surface concentration only, no interannual/seasonal variability) for climate model evaluation.
The authors justify the compilation of the observational data into coarsely gridded climatological data as a way to characterize the spatial variability of aerosols. I am very puzzled by that purpose. Considering the large heterogeneity of the sources of aerosol and precursor gases and the short lifetime of aerosols, such spatial differences across different aerosol regimes and environmental conditions are to be expected and have been captured by most global models. The large spatial variability of aerosols has also been observed by numerous satellites in the past 30 years. Therefore, I don’t understand the rationale for spending tremendous effort to generate a dataset with such limited usefulness, one that hides interannual and seasonal variabilities and is not suitable for any air quality related studies.
In addition, the data sources can be better described, to at least mention major networks and key individual datasets and to describe the similarities and differences in measurement techniques.
The authors have deferred the temporally resolved data and more information to the GHOST dataset, which includes a subset of the PM data collected in this work, and they “hope that the GHOST effort can expanded to include more spatial variability and be maintained into the future”. This should not be the reason that the temporal variability is not provided in this AERO-MAP compilation, especially since such variability might be lost if the hope for the GHOST expansion is not realized.
2. Model simulation and evaluation method:
The manuscript devotes most of its length to modeling and comparison with the compiled data. A few additional aerosol sources and types have been added to the CAM6 model simulated aerosols, including agricultural dust, coarse mode BC and OC, coarse ash from industrial sources, primary biogenic particles, and ammonium nitrate. However, there are several issues that need to be clarified. It is not described how many years of model simulation were conducted to compare with the data that are averaged from different time periods from 1986 to 2023, although it is said that the model uses the 2010 emissions. It is also not clear how ash emissions from industrial sources are implemented (are all industrial sectors emitting ash, or just certain sectors?), whether all particles are converted to the aerodynamic size and how the conversion is done, etc. In the end, the model evaluation stops at the model-data comparison, and there is no “lesson learned” to move forward with using the data to constrain and improve the model. Besides, not showing the temporal variability in different regions has really hindered the thoroughness of the model evaluation. In addition, there are many places regarding the model that need to be clarified/corrected (see specific comments below).
For all the model vs. observation scatter plots, it is necessary to show regional statistics instead of lumping them together to better assess the regional bias. The correlation coefficients are necessary but not sufficient; the relative bias and RMSE should also be calculated and shown.
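As an illustration of the per-region statistics requested here, the sketch below computes the correlation coefficient, relative bias, and RMSE separately for each region. The region names and the (observation, model) pairs are hypothetical, and "relative bias" is taken here to mean the normalized mean bias, which is one common definition and an assumption on my part; the statistics are computed in linear rather than logarithmic space.

```python
import math


def regional_stats(pairs):
    """Pearson correlation, relative (normalized mean) bias, and RMSE.

    `pairs` is a list of (observed, modelled) values for one region.
    """
    n = len(pairs)
    obs = [o for o, _ in pairs]
    mod = [m for _, m in pairs]
    mean_o, mean_m = sum(obs) / n, sum(mod) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in pairs)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_m = sum((m - mean_m) ** 2 for m in mod)
    return {
        "r": cov / math.sqrt(var_o * var_m),
        "rel_bias": (mean_m - mean_o) / mean_o,
        "rmse": math.sqrt(sum((m - o) ** 2 for o, m in pairs) / n),
    }


# Hypothetical (observed, modelled) concentrations in ug/m3, grouped by region.
data = {
    "Region A": [(10.0, 12.0), (20.0, 24.0), (30.0, 36.0)],
    "Region B": [(5.0, 4.0), (15.0, 10.0), (25.0, 30.0)],
}
stats = {region: regional_stats(pairs) for region, pairs in data.items()}
```

Reporting these three numbers side by side shows why correlation alone is insufficient: in the hypothetical "Region A" the model is perfectly correlated with the observations yet biased high by 20 %.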
3. Measurement and modeling gaps identified:
In the conclusion, the identified gaps for the CESM2 (or is it CAM6?) include 1) the model missing 10-60% of the particulate mass due to the lack of nitrogenous particles (ammonium and nitrate) and 2) the poorly understood agricultural dust particles. However, the comparisons show the opposite regarding the missing mass and the lack of nitrogen aerosols. Almost all the model-data comparison figures show that, overall, the model simulated aerosol mass and composition, including ammonium and nitrate, are higher than observed, not lower, and the actual ammonium nitrate simulated by the model must be cut in half (section 2.2) to agree better with the observations. As for the “poorly understood agriculture dust”, the comparisons of Al (dust proxy) are done for dust emitted from all sources and agricultural dust is not evaluated separately; that means there is no way to point specifically to problems in agricultural dust.
Given the above reasons, I recommend against publishing the manuscript in its present form. A major revision should be made to better disseminate the collected dataset, to clarify the modeling, and to improve the evaluation matrix.
Specific comments:
Line 76: Change “but” to “and”.
Line 85-87: “Most climate models exclude 10-30% of aerosol particles…”: this sentence seems baseless. How many climate models have been surveyed to reach such a conclusion?
Line 112-113: “However, most of these comparisons include data only from North America and Europe”: This is certainly not true. The comparisons were done with data from the globe without regional preference, although most surface PM data measurements are available from North America and Europe.
Line 161-163: The justification for focusing on the climatological mean is not sound. All models should have at least monthly means that are easy to obtain, and assessing aerosol radiative effects and aerosol-cloud interactions certainly requires time-resolved fields, not climatological means.
Line 164: “The climatological mean is obviously less important for extreme air quality events”: It is really not useful for any air quality related studies, which require high spatial and temporal resolution data, not climatological means.
Line 176: How does advertisement at international meetings collect data? It sounds strange.
Line 181-182: Does the PM10 in the compilation include PM2.5, or just PM2.5-PM10 (i.e., particle diameter larger than 2.5 micron and less than or equal to 10 micron)?
Line 184-185: The supplemental dataset 1 only shows the year of the measurements, no information on season/month/day.
Line 186-205: The different techniques and possible issues for each measurement dataset should be documented in the data archive.
Line 206: There is a 38-year time span of the data collected, when the emissions changed significantly around the globe. Therefore, the “climatological mean” is not meaningful.
Line 223: How much of the model-data difference might be attributed to the model using 2010 emissions while the data span as long as 38 years?
Line 224: CAM6 or CAM5?
Line 230-231: CMIP6 emissions do not include natural emissions.
Line 240: What crop area data was used?
Line 243: There is no separate evaluation of agricultural dust presented in the manuscript.
General question: Does the model include secondary organic aerosols from anthropogenic, biomass burning, and biogenic VOC oxidations?
Line 269-270: If the model already assumes that sulfate and nitrate are fully neutralized as ammonium sulfate and ammonium nitrate, what is the role of simulated NH4+?
Line 278-279: Such an adjustment for nitrate (cut in half) does not seem to be warranted. By the same practice, one could multiply all model simulated aerosol species by any number in order to best match the observations; this is not called model evaluation.
Line 283: How does the model convert sea salt to sodium and chloride? In fact, chloride is not used anywhere in the paper.
Line 294: Should be “one-to-one”, not “on to one”.
Line 301-317, the paragraph about aerodynamic size:
- Since the measurements are done with a variety of techniques, how do you first harmonize the aerosol particle size across those different measurements? Clearly, some (probably most) of the techniques measure aerodynamic size but others do not.
- Converting geometric diameter to aerodynamic diameter requires the consideration of both particle density and shape factor. In that regard, not only the model simulated dust particles but also the other aerosol species should be converted to aerodynamic size. But the aerodynamic-equivalent geometric sizes for these aerosols are not described in the manuscript. Are they considered? What are the equivalent sizes for these aerosols?
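The dependence on density and shape factor raised in the two points above can be sketched with the standard textbook relation d_ae = d_geo * sqrt(rho_p / (chi * rho_0)), where rho_0 is unit density (1000 kg/m3), chi is the dynamic shape factor, and the slip correction is neglected (reasonable for particles well above about 1 micron). This is not necessarily the conversion used in the manuscript, and the density and shape-factor values in the example are illustrative assumptions only.

```python
import math

RHO_0 = 1000.0  # reference (unit) density in kg/m3


def aerodynamic_diameter(d_geo, rho_p, chi=1.0):
    """Aerodynamic diameter from geometric (volume-equivalent) diameter.

    rho_p is the particle density in kg/m3 and chi the dynamic shape factor
    (1.0 for a sphere). The Cunningham slip correction is neglected.
    """
    return d_geo * math.sqrt(rho_p / (chi * RHO_0))


def geometric_diameter(d_aero, rho_p, chi=1.0):
    """Inverse of aerodynamic_diameter under the same assumptions."""
    return d_aero * math.sqrt(chi * RHO_0 / rho_p)


# Illustrative only: geometric cut-off equivalent to a 10 micron aerodynamic
# cut-off for an assumed density of 2500 kg/m3, with two assumed shape factors.
for chi in (1.0, 1.4):
    d_geo = geometric_diameter(10.0, 2500.0, chi)
    print(f"chi = {chi}: geometric diameter = {d_geo:.2f} micron")
```

Because the result changes with both rho_p and chi, a single geometric cut-off derived for one species cannot simply be reused for species with different density or shape.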
Line 353-354: “this dataset presents a huge increase in the amount of data available to the aerosol modeling community”: Do you have any statistics to support such a statement? What is the amount of data currently available and how much of an increase does the data compiled in this study offer?
Line 360-361, different measurement durations: It is important to document the duration of each dataset to understand the difference between the model and data.
Section 3.2 in general: It is not clear what the definition of “uncertainty” is. It seems the terms “variability” and “uncertainty” are sometimes used interchangeably. For example, how do you assign the uncertainty values for “within grid variability”, “within year variability”, and “interannual variability”? (Figure 1g).
Line 408-409: This sentence is unclear. Are you saying that the model overestimates SO2 compared to the satellite remote sensing SO2, so the model overestimation of PM2.5 over China can be attributed to an overestimation of emission?
Line 410-411: Do you see from the CMIP6 emission dataset whether the emissions over India and China in more recent years are lower than those in 2010, to support this statement?
Line 414: “Much of the data in those regions are not usually included in compilation of data”: Are there any references to support that statement?
Line 427: What is the default version of the model and what are included?
Line 429: “…but this bias (overestimation of sulfate) was seen in this model”: Is there any explanation for such a long history of high bias in the model? I wonder why this model has not been improved, since the problem has been known for more than 10 years.
Line 439-440: “…(bias) must be due to biomass burning and/or industrial emissions”: What diagnostics have you run to attribute the cause of the bias solely to emissions but not to chemical formation or dry/wet deposition processes?
Line 441: Again, how is the model simulated sea salt converted to Na?
Line 443-444 and Figure 4: If Na represents sea salt, how can it go so far inland? What is the land source of Na? Even though the industrial source of Na is not included in the model, it should be captured by the measurement over land.
Line 449-461 about nitrogen aerosol species: In the model description section, it is assumed that all nitrate aerosol is in the form of ammonium nitrate and all sulfate aerosol is in the form of ammonium sulfate, which contradicts the description in this paragraph that the cations (NH4+) and anions (SO4=, NO3-) are not balanced, although you do not have the thermodynamic equilibrium process to partition the amount of NH4+ to sulfate or nitrate. So, what is the point of the discussion?
Line 462-463: The NO3- from the model compared in Figure 4k and l is reduced in half, per description in line 278-279, right? It should be clarified.
Line 465: It is not clear where the “agriculture regions” are in Figure 4k and l to show the importance of NH4+.
Line 473: Again, the conversion between aerodynamic diameter and geometric diameter depends on the particle composition (density, shape) and is not universal.
Line 476-477: The conversion factor for coarse sea salt can be calculated from sea salt aerosol density and shape factor.
Line 478, 480, and Figure 5: The figure labels are inconsistent. Is it Fig 5a, c, d, or Fig 5a, b, c?
Line 485 and Figure 6a, b: How was the sulfate PM10 calculated? It was not described before.
Line 494: “well simulated” is subjective. What is the criterion for “well simulated”? Please be quantitative and objective.
General comments for sections 3.3 and 3.4, model-data comparisons: The model simulated PM2.5, PM10, and related aerosol species are compared to the compiled observations to show the model performance against the data. However, the effort seems to stop just here; there is no effort to show how the data can be used to “constrain” the model, even though it has been stated “Our goal in this study was to provide observational constraints” (line 358). What should be done next? How can the model be meaningfully constrained by observations?
Line 526: “nitrogen aerosol emissions”: NH4+ and NO3- are secondary aerosols that are not directly emitted but are formed in the atmosphere from the oxidation of their precursor gases.
Figures:
- Figure 1g: What is the x-axis?
- Figure 2a: It is very hard to see the model simulated color over North America, Europe, and Asia because they are all covered by the circles. The map should be made much larger. Currently it is illegible.
- Figure 2b: The spatial gradient from observations is much smaller than that from model simulations, with most data falling into 10-50 micron. Increasing the resolution of the color contours may help to better reveal the spatial gradient in the data. Also, the domain in Fig. 2b is not just East Asia; it covers Central Asia, East Asia, South Asia, and Southeast Asia. Calling it Asia is more appropriate.
- Figure 2c (and similar for other scatter plots in this paper): It is hard to distinguish the regions. I suggest plotting each region separately to show its characteristics, and including correlation coefficients, relative bias, and RMSE on each plot for each region.
- Figure 3: It is not mentioned and used in the text.
- Figure 5: a, c, d on the panel but a, b, c, in the caption (and in the main text). Make them consistent!
- Figure 6b title: It should be PM10, not PM2.5.
- Figure 6, the right panel on the second row: It should be panel d, not c, and the title should be PM10, not PM2.5.
Citation: https://doi.org/10.5194/egusphere-2024-1617-RC1
RC2: 'Comment on egusphere-2024-1617', Anonymous Referee #2, 18 Aug 2024
Review Mahowald et al Aero-Map
General comments
The study is a very worthy attempt to gather data on PM and their composition from worldwide sites. The authors are praised for the attempt and the work which went into it. However, unfortunately the compilation lacks quite some detail to convince others to use it, and is a bit too vague in the discussion of errors and model-data comparison, in my opinion. Can this be improved?
The period 1986-2023 brings along considerable trends in PM concentrations, with declines of the order of 2-3% per year, amounting to reductions by probably more than 50% in three decades in many stations. That would make any average strongly dependent on the period averaged. A better documentation of the sampling period is needed for usefulness. How large is the temporal coverage of the data in the period used for averaging? How large is the temporal coverage for the chemical composition data put aside the PM data? (If that's not the same, I would recommend producing different tables, or specifying the temporal coverage and period for each component.) There is a vague description in the paper of taking in more data if the region is scarcely sampled. Which data are used, although they do not fulfill the general coverage requirements set out in the paper? Even though the spatial variability is larger than the trend, the comparison in a region is dominated by the emissions in the specific period under consideration. The model-data comparison in this paper also demonstrates that regional bias is to be understood and not just the global spatial sd.
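The arithmetic behind the trend argument above can be checked with a short sketch. It assumes a constant fractional decline compounded annually, which is a simplification; the 2 % and 3 % rates and the 30-year span are the figures quoted in the comment.

```python
def remaining_fraction(rate_per_year, years):
    """Fraction of the initial concentration left after a compounded decline."""
    return (1.0 - rate_per_year) ** years


for rate in (0.02, 0.03):
    reduction = 1.0 - remaining_fraction(rate, 30)
    print(f"{rate:.0%} per year over 30 years -> {reduction:.0%} total reduction")
```

Under this assumption, 2 % per year gives a reduction of roughly 45 % after 30 years and 3 % per year roughly 60 %, so the "more than 50 %" figure corresponds to the upper part of the quoted range.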
A little more care needs to go into preparing the main output, the yearly averages in csv format: Why do coarse and fine data appear? I thought all is converted to PM10 and PM25? It left me wondering if data are doubled. I found for instance PM10, PM25 and coarse and fine for Birkenes, Norway, for slightly different, but overlapping periods. Also station K_pszta. Hard to check throughout for users/me. What is the PM measurement method used, can the PM methods be categorised? That categorisation could contain info on a certain likely bias. What shall the user of the dataset understand when min year and max year have values of 0, -999? How is the standard deviation to be understood (is that from yearly means over the period of observation)? Why are EC and BC included? Why OC and OM? Why S and SO4? What is Nap? Why Al, dust and ash? Would it be easier to have a metadata column signalling methods?
The paper was sent to ESSD before submission to ACP. I have looked at the many comments from two reviewers there, and the short response to these comments by the authors. The authors stated that all would be considered before re-submission to ACP. Although some changes are introduced to the paper, it is very unclear to me what has been addressed, and my impression is that not all comments have been taken up. I would like to ask the authors to address all relevant comments from the ESSD review here as well. It's a waste of time for reviewers and readers if those valuable comments are lost in the transfer to another EGU paper. I understand that the comparison to the model was not fit for ESSD, but that was only part of what the reviewers asked/commented on.
I finally wonder why the annual mean is used and not a monthly mean climatology. It's not so much more data and it would make the data much more useful.
And how much of the PM10+PM25 mass is reconstructable from the composition? I can't find an evaluation of that. It would be useful to underpin that there is some missing component. I actually do not think it's nitrate, or spores, or ash… what about water?
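A mass-closure check of the kind asked for here could look like the sketch below. It is a hedged illustration, not the authors' procedure: the species names and concentrations are hypothetical, and the only value taken from the manuscript is the OM = 1.8 * OC conversion quoted elsewhere in this review. A real closure would also need to avoid double counting overlapping columns such as S and SO4, or EC and BC.

```python
OM_PER_OC = 1.8  # OM = 1.8 * OC, the conversion quoted from the manuscript


def reconstructed_fraction(pm_total, components):
    """Fraction of the measured PM mass explained by the summed components.

    `components` maps species name to concentration (same units as pm_total).
    If only OC is given it is converted to OM; if OM is present, OC is
    dropped so that organic mass is not counted twice.
    """
    species = dict(components)
    oc = species.pop("OC", None)
    if "OM" not in species and oc is not None:
        species["OM"] = OM_PER_OC * oc
    return sum(species.values()) / pm_total


# Hypothetical site: measured PM 20 ug/m3 and its speciated components.
components = {
    "SO4": 4.0, "NO3": 2.0, "NH4": 2.0, "OC": 2.5,
    "BC": 1.0, "dust": 3.0, "sea_salt": 1.5,
}
fraction = reconstructed_fraction(20.0, components)
unexplained = 1.0 - fraction
```

The unexplained remainder is the quantity that would indicate a missing component such as particle-bound water, which is the alternative raised in the comment.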
Specific comments
The 2x2 aggregated file contains per lat lon box the number of observations and the number of stations. What is the number of observations here? I can't find a clear definition of how this is counted.
l86 Most climate models exclude 10-30% of the aerosol particles in both PM2.5 and PM10 size fractions
=> where does this number come from?
l148: “here we focus on characterizing in observations and models the spatial variability of the surface concentrations”
=> grammar wrong?
l161 We focus on the spatial distribution of climatological mean, as that is easily obtained from models, and the most important variable for many climate impacts
=> is it really true, that the mean spatial distribution is most important?
l166 Quass et al., 2022 => Quaas et al.
l181 “Some measurement sites measure PM2.5 and coarse (PM2.5 to PM10) aerosols “
=> Do they really measure that, or do they only provide that coarse aerosol info as difference of two measurements ? See also my comment above to coarse and fine PM data.
l183 we included less complete datasets at sites in regions with limited data. =>
What does that exactly mean? Which data are included thus? Please mark in the table.
l184 The time period for different datasets is included in the supplemental dataset 1
=> I only find the time period for PM. Are the composition measurements really done for the same period as for PM?
l196-197: we include measurements of total suspended particulates (TSP) with the PM10, because of the lack of size-resolved data.
=> What does that exactly mean? Which data are included thus? Please mark in the table.
l201: with respect to the definition 1.8 * OC = OM => Why is this mentioned? Why are both OC and OM data in the table?
This is repeated in l300 and can be omitted in one or the other place; anyway, I don't understand exactly why this is mentioned.
l202: why include EC and BC? Are they treated differently when comparing to the model?
l216+l261 : Not sure why E3SM is mentioned, can be omitted.
l278 the calculated nitrate aerosol amounts are multiplied by 0.5 to best match the available observations.
=> I don't see why this should be done here; why not remove all bias with a factor in preprocessing?
l299 Table S2: I do not see the value of this table, can be omitted and the info is already, or should be just in the text.
l294: on- to one ….
l303: electrical mobility analysers: when are these used for PM measurements? omit.
l309 and more places: The use of PM6.9 cannot be suggested like this. It really only makes sense for dusty episodes and stations. In the paper it sounds like it should be used in general.
l318/319: For ease of viewing, ….the comparisons should also be shown for different regions in separate figures.
l324: Notice that we include both urban regions and rural or remote sites
=> Well, urban sites will create a negative bias for the global model. So it would be necessary to identify them. Or rather state that you do not know how to exclude the urban sites.
l333: with in => within ?
l338 comparisonl => comparison
l338 dry vs. moist aerosol mass different inlet geometries => clarify
and in general, are the methods known? Why not specify the most important measurement categories in the table?
l350 and other places: “total mass” (PM) =>
What do you mean: TSP, PM10, PM25, or any of them?
l359 prioritizing long term stations with composition data, but in regions with few measurements, we include only PM data, or data collected during field campaigns, which may last only a month or two.
=> What does that mean? It's a vague statement. Can you add a simple categorizing column in the table, which explains why you included a station?
l367 Thus, the dataset described here cannot do a good job of constraining aerosol concentrations that are due to episodic emission events like wildfires or dust in regions without long term datasets. =>
Probably true, but what does the user take from this statement?
l372 “variability contribution to the uncertainties”
=> Is variability an uncertainty? Not per se, I would say. Be more precise.
l373: for within year, with grid
=> for within year, within grid ?
l379 ff. What about decadal trends affecting the spatial variability, if the stations are sampled in different periods?
l385: trends can be neglected? =>
I really don't think so.
l394 there is much more variability across different grid boxes (4-5 orders of magnitude) than in time (up to 50%)
=> “in time” - what does that mean?
l394 As expected, the model contains more spatial variability than the observations,
=> Why expected ?? after colocation with the measurements? or without colocation? I think the comparison should be done after colocation.
=> By the way, where do I see that the model has more spatial variability?
l411 It would be useful to quantify which fraction of the model data points is within factor 3 of the observations.
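The statistic requested here is simple to compute. The following is a minimal sketch with hypothetical (observed, modelled) pairs; the function name and the choice to skip non-positive values (for which a ratio is undefined) are assumptions.

```python
def fraction_within_factor(pairs, factor=3.0):
    """Fraction of (observed, modelled) pairs with model/obs within the factor.

    Pairs with a non-positive value are skipped because the ratio is undefined.
    """
    valid = [(o, m) for o, m in pairs if o > 0 and m > 0]
    hits = sum(1 for o, m in valid if 1.0 / factor <= m / o <= factor)
    return hits / len(valid)


# Hypothetical co-located values in ug/m3.
pairs = [(10.0, 12.0), (10.0, 35.0), (5.0, 1.0), (2.0, 5.0)]
within_3 = fraction_within_factor(pairs)
```

Unlike a correlation coefficient, this measure is insensitive to a few extreme outliers and is symmetric between over- and underestimation.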
l413 The dot plots are rather un-insightful. It would be more interesting to show a map of bias.
l425 Why is the correlation coefficient going down for the gridded data? Is that an effect of the smaller weight given to well observed regions?
In general: How were the uncertainties of the component concentrations established?
l540 Unfortunately, there are still very limited data characterizing both the surface concentration, size and composition of aerosol particles
=> How do you get to the statement, that there are very limited data available? When will we have enough data?
l549 We also present a method that is generalizable to other models …
l552 This study has highlighted the value of surface concentration data
=> not sure you present a “method” and that you “highlight the value”. What did you learn using the data and the method, that you did not know before about CESM?
l560 This study also highlights the importance of including all aerosol components into the models,
=> not sure about this one either. Where does it highlight the importance of including all components?
l561 in many places there is between 10-60% of the particulate mass missing, largely due to lack of the nitrogenous particles
=> Where do you show that you miss 10-60% of the mass? The abstract mentioned 10-30%. Anyway, where are the statistics presented in the paper? And why is it largely due to the lack of nitrogenous particles? What about other unknowns: SOA, water?
Citation: https://doi.org/10.5194/egusphere-2024-1617-RC2
Data sets
AERO-MAP: A data compilation and modelling approach to understand the fine and coarse mode aerosol composition Natalie M. Mahowald, Longlei Li, Julius Vira, Marje Prank, Douglas Hamilton, Hitoshi Matsui, Ron L. Miller, Louis Lu, Ezgi Akyuz, Daphne Meidan, Peter Hess, Heikki Lihavainen, Christine Wiedinmyer, Jenny Hand, Maria Grazia Alaimo, Célia Alves, Andres Alastuey, Paulo Artaxo, Africa Barreto, Francisco Barraza, Silvia Becagli, Giulia Calzolai, Shankarararman Chellam, Ying Chen, Patrick Chuang, David D. Cohen, Cristina Colombi, Evangelia Diapouli, Gaetano Dongarra, Konstantinos Elfetheriadis, Johann Engelbrecht, Corinne Galy-Lacaux, Cassandra Gaston, Dario Gomez, Yenny González Ramos, R. M. Harrison, Chris Hayes, Barak Herut, Philip Hopke, Christoph Hüglin, Maria Kanakidou, Zsofia Kertesz, Zbigniew Klimont, Katriina Kyllönen, Fabrice Lambert, Xiaohong Liu, Remi Losno, Franco Lucarelli, Willy Maenhaut, Beatrice Marticorena, Randall V. Martin, Nikolaos Mihalopoulos, Yasser Morera-Gomez, Adina Paytan, Joseph Prospero, Sergio Rodríguez, Patricia Smichowski, Daniela Varrica, Brenna Walsh, Crystal Weagle, and Xi Zhao https://zenodo.org/records/11391232
Model code and software
AERO-MAP: A data compilation and modelling approach to understand the fine and coarse mode aerosol composition Natalie M. Mahowald, Longlei Li, Julius Vira, Marje Prank, Douglas Hamilton, Hitoshi Matsui, Ron L. Miller, Louis Lu, Ezgi Akyuz, Daphne Meidan, Peter Hess, Heikki Lihavainen, Christine Wiedinmyer, Jenny Hand, Maria Grazia Alaimo, Célia Alves, Andres Alastuey, Paulo Artaxo, Africa Barreto, Francisco Barraza, Silvia Becagli, Giulia Calzolai, Shankarararman Chellam, Ying Chen, Patrick Chuang, David D. Cohen, Cristina Colombi, Evangelia Diapouli, Gaetano Dongarra, Konstantinos Elfetheriadis, Johann Engelbrecht, Corinne Galy-Lacaux, Cassandra Gaston, Dario Gomez, Yenny González Ramos, R. M. Harrison, Chris Hayes, Barak Herut, Philip Hopke, Christoph Hüglin, Maria Kanakidou, Zsofia Kertesz, Zbigniew Klimont, Katriina Kyllönen, Fabrice Lambert, Xiaohong Liu, Remi Losno, Franco Lucarelli, Willy Maenhaut, Beatrice Marticorena, Randall V. Martin, Nikolaos Mihalopoulos, Yasser Morera-Gomez, Adina Paytan, Joseph Prospero, Sergio Rodríguez, Patricia Smichowski, Daniela Varrica, Brenna Walsh, Crystal Weagle, and Xi Zhao https://zenodo.org/records/11391232