the Creative Commons Attribution 4.0 License.
The Solar Zenith Angle Impacts MODIS versus CALIPSO AOD Retrieval Biases, with Implications for Arctic Aerosol Seasonality
Abstract. Station observations of surface Arctic aerosol have long shown a pronounced seasonal cycle, with burdens characteristically peaking in the late winter/early spring. Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) aerosol optical depth (AOD) products replicate this seasonality, but passive sensor and reanalysis data products do not. We find that the sub- and low-Arctic seasonality of gridded AOD products from six passive sensors diverges from that of CALIOP data products during the months of September–April, even when controlling for sampling biases. Using collocated CALIOP and Moderate Resolution Imaging Spectroradiometer (MODIS) (Aqua) retrievals, we find that for collocations characterized by low-quality MODIS retrievals, the bias between MODIS and CALIOP strongly depends on the solar zenith angle (SZA), with MODIS AODs showing a 132 % reduction relative to the instrument-mean over a theoretical 0–90° SZA domain. As the fraction of MODIS retrievals flagged as “low-quality” increases with higher SZAs, retrieval quality mediates the relationship between the SZA and dataset biases in gridded products. The dependency is likely the result of cloud-adjacency effects, and likely also affects midlatitude AOD seasonality. Though additional sources of uncertainty in high latitude retrievals remain, the observed dependency likely impacts passive data products’ representations of (sub-)Arctic aerosol burdens in boreal spring and autumn, which are important for understanding aerosol processes in a highly sensitive yet understudied region. This work also contributes to improved understanding and quantification of the effects of viewing geometry on satellite AOD retrievals, which can help constrain aerosol observations and associated forcings, globally.
Status: closed
- RC1: 'Comment on egusphere-2024-3596', Anonymous Referee #2, 22 Jan 2025

The paper demonstrates a difference between the seasonality of Arctic aerosol loading reported by passive imagery and orbital lidar/surface observations. Lidar and reference observations see maximal aerosol optical depth during winter, with two to four times greater values than during summer. A suite of six passively sensed datasets exhibits the opposite behaviour. The cause of this is explored for the combined Dark Target Deep Blue (DTDB) MODIS product by examining the relative difference between it and CALIOP observations at Level 3. It is shown that the relative difference becomes increasingly negative as solar zenith angle (SZA) increases and that this is concentrated within MODIS observations flagged as lower quality. Some possible explanations for this trend are eliminated, such as CALIOP’s sensitivity changing with SZA. The authors argue that the erroneous seasonality is caused (at least in part) by large SZA during winter resulting in a preponderance of low-quality MODIS retrievals, which have a lower sensitivity to aerosol (and/or systematically underreport AOD) and, therefore, improperly reduce L3 AOD over the Arctic during winter.

I recommend this paper for publication after considering some minor points. It was an engaging and interesting read. I think I was already aware of some of the central points – MODIS retrievals are less accurate at large SZA and Arctic variability is poorly captured – but this manuscript is a thorough examination of the topic and is more accessible than the technical reports where the information is currently presented. A selection of minor comments and technical corrections follow for the authors to consider in the event that their submission is revised.

- L379: The paper would benefit from a more detailed explanation of what the QA flags denote and how they are derived. It would alleviate my concerns over the absence of low-quality observations over land, which is unexpected as Arctic land is one of the most difficult environments over which to retrieve aerosol. Further, the supplementary figures imply that the main document should only discuss retrievals over sea as the land appears to be virtually unaffected by SZA. Given Rob is on this paper, I’m sure the authors properly understand the quality flags but it may be helpful to briefly explain their derivation. (A flowchart would be lovely as my experience is that MODIS QA is described over several documents, some of which amend previous versions.) My reading is that DB QA is based on variance in the pixels, while Tables C1 and 2 of Levy et al 2016 state that there is only one way for land pixels to be flagged QA=1 (having between 21 and 30 pixels) while ocean has several routes, such that QA doesn’t have a consistent meaning between the two domains. Personally, I’d have dug into the QA bitmasks to see if the SZA effect was constrained to specific channels or surface conditions but that is too much work for a correction.
- I also note point F on page 3 of https://atmosphere-imager.gsfc.nasa.gov/sites/default/files/ModAtmo/Collection_006_Changes_Aerosol_v28.pdf, which seems relevant to the zeroing of AODs over ocean discussed in this paper.
 
- L173: Are you sure the L3 averaging disregards QA? Page 10, paragraph 2 of https://atmosphere-imager.gsfc.nasa.gov/sites/default/files/ModAtmo/ATBD_MOD04_C005_rev2_0.pdf states, “Those retrievals with QAC=3 are assigned higher weights than those with QAC=2 or QAC=1.” Apologies if I missed a later revision.
- L465: While I agree with your point here, I feel that the problem in the L3 data considered is more about producing useful uncertainty estimates to either reintroduce weighting (if my above point is wrong) or fix the existing one. I’ve been to enough AEROSAT discussions to know why the DTDB team is resistant to that approach – and I’m not asking for the authors to apply it here – but I think this is a good opportunity for the authors to discuss what uncertainty information would be needed. The data presented in this study could be used to include an SZA term within the expected error envelope of low-quality data. What validation campaigns or sites would be necessary to properly understand these limitations? When bidding for new infrastructure, it would be useful to be able to point at a direct request from an independent team.
- L284-287: I’m not sure I agree with the wording used here: the reasonableness with which a median represents the sampled population is determined by the distribution of the quantity measured and there isn’t a single sample size that achieves that for all distributions. However, I believe what you were attempting to say is that the number of samples is basically constant through the year for each sensor and, therefore, there is no expectation that the shape of any curve in Fig. 5 has been influenced by the number of observations.
- L345: I’m also not sure I agree with this wording: if one instrument consistently reports much smaller values than the other or one has substantially larger variability, it could still ‘dominate’ the metric. You normalize because the values you wish to evaluate cover an order of magnitude while suffering approximately constant uncertainties, such that a relative metric is more informative of the full range than an absolute one.
- L366-378: The description of the slopes here was difficult for me to understand. When you say “an approximate 97% negative difference in the bias relative to the instrument-mean, from 0 to 90 SZA”, do you mean “if the red line in Fig. 7a were extended across the full range of x, then the difference between its maximal and minimal values is 0.97”? My first guess was that you meant that the ratio between the slope and the intercept was 0.97, but I eventually realised that was nonsensical. My problem may have been recalling that the y-axis is a ratio that can be expressed as a percentage. A different framing may be clearer to a reader encountering this data for the first time, such as writing the slope in the form $RD = m \cos\Theta + a$.
- L486-497: I know that [75,79) means $75 \le SZA < 79$. I do not know what <[22,26) means nor how it differs from [>45,49). My best guess is “At low (<22) SZAs… Above moderate (>49)…”.
- L384: It occurs to me that it is possible to include binary variables within regression models, such that a simultaneous regression of relative difference against cos SZA and high/low quality could be done in a future study. I’ve never done it myself, but I have seen such regressions applied to polls using party identification and income or age as variables.
- On page 12, you say that the CALIOP data is subsampled to only cells where passive sensor data is available. I am personally curious how these compare to (a) each other and (b) the total population. It doesn’t need to be in the final paper but, if you have the time, I would greatly appreciate seeing a single plot of the solid yellow lines of Fig. 5 alongside the equivalent for all points in your reply. Their spread would be a simple estimate of the effect of sampling caused by cloud and failed retrievals.
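
As a hedged illustration of the two regression suggestions in the comments above (the linear form $RD = m \cos\Theta + a$ from the L366-378 comment, and the binary quality indicator from the L384 comment), the following Python sketch fits both terms at once by ordinary least squares. It is not the authors' method and uses no data from the manuscript: the function name `fit_rd_model`, the coefficient values, and the synthetic, noise-free sample are all assumptions made purely for illustration.

```python
import numpy as np


def fit_rd_model(sza_deg, low_quality, rd):
    """Least-squares fit of RD = a + m*cos(SZA) + b*low_quality.

    `low_quality` is a 0/1 indicator (hypothetical encoding: 1 means the
    retrieval is flagged low-quality), entered as an ordinary regressor.
    """
    cos_sza = np.cos(np.radians(np.asarray(sza_deg, dtype=float)))
    design = np.column_stack([
        np.ones_like(cos_sza),                 # intercept a
        cos_sza,                               # slope m on cos(SZA)
        np.asarray(low_quality, dtype=float),  # offset b for low-quality data
    ])
    coef, *_ = np.linalg.lstsq(design, np.asarray(rd, dtype=float), rcond=None)
    a, m, b = coef
    return a, m, b


# Synthetic, noise-free sample (illustrative values only, not from the paper).
rng = np.random.default_rng(0)
sza = rng.uniform(20.0, 80.0, size=500)   # solar zenith angle in degrees
lowq = rng.integers(0, 2, size=500)       # 0 = high quality, 1 = low quality
rd = -0.9 + 1.0 * np.cos(np.radians(sza)) - 0.3 * lowq

a, m, b = fit_rd_model(sza, lowq, rd)

# cos(SZA) runs from 1 at 0 degrees to 0 at 90 degrees, so the fitted line
# changes by (a + m*0) - (a + m*1) = -m across the theoretical 0-90 domain.
change_0_to_90 = -m
```

Under this framing, the change in the fitted relative difference across a theoretical 0–90° SZA domain is simply the negative of the slope `m`, because $\cos\Theta$ spans exactly one unit over that range; the coefficient `b` is the constant offset associated with the low-quality flag.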
Technical corrections:
- You are inconsistent in hyphenating “low-quality” when used as an adjective.
- L39: The EarthCARE
- L45: I think it should be ‘depends’ as the sentence subject is ‘representation’ rather than ‘AODs’.
- L92: dark target product assumes
- L213: Aerosol
- L248: non-NaN
- L455: retrieves an AOD
 Citation: https://doi.org/10.5194/egusphere-2024-3596-RC1 
- RC2: 'Comment on egusphere-2024-3596', Lorraine Remer, 06 Jun 2025

This manuscript delves deeply into the question of why passive sensors, particularly MODIS Dark Target/Deep Blue aerosol products, do not capture the same seasonal cycle as does CALIOP. The study accounts for sampling biases, solar zenith angle, data retrieval quality flags and includes analysis of situations where one of the sensors returns a ‘zero’, as well as when both sensors report positive AOD. The study shows that assimilation data sets tend to follow the passive sensor seasonal signatures because they are dependent on the passive sensors, so that the results have significant consequences down the line and across disciplines. The authors do a very complete job, examining biases and correlations between the passive and active sensors. The figures are informative and the manuscript is very easy to read.

I recommend publication. Although I do want to point out that at the end we learn quite a bit about retrieved aerosol data sets and their biases, but not very much about Earth’s atmosphere or aerosols. I personally would have submitted this to Atmospheric Measurement Techniques, not to Atmospheric Chemistry and Physics.

I have no need to remain anonymous. This is Lorraine Remer writing.

There are a few things that caught my eye as I was reading.

LiDAR and AOD.

I come from the passive remote sensing side, so this may just be me, but I could have used a little bit more depth on AOD products from CALIOP, or maybe a little bit more information up front.

L30 “also provide vertically resolved extinction profiles”. CALIOP measures backscattering profiles, not extinction.

L44 “lidar ratio”: it is mentioned here for the first time with no explanation. Perhaps it should be defined?

L111 and L112 “CALIOP retrieves backscatter with depolarization at 532 and 1064 nm, with L2 and L3 aerosol data products providing AODs at 532 nm.” Yes. CALIOP retrieves backscatter and depolarization, but then there is a big leap to AOD.

Eventually the manuscript does describe the CALIOP processing routines, briefly, and it does mention the possibility of incorrect assumptions of lidar ratio affecting biases and correlation between CALIOP AOD and passive sensors. The authors aren’t amiss here. I just encourage them to consider bringing a little bit more explanation up front.

MODIS quality flags.

It is no surprise that the sensors develop biases and lose correlation as the number of retrievals move to marginal QA flags (QA = 1). There is confusion in the recommendations for use of data with these QA flags.

On this web page: https://darktarget.gsfc.nasa.gov/products/viirs-modis/level-2-product-contents
We see the statement: “For Ocean based products we suggest using only QA 2 and 3”

But on this web page: https://darktarget.gsfc.nasa.gov/what-are-quality-flags-qa-what-do-they-mean-and-where-can-i-find-them
We see the statement: “For ocean products we advise using anything above QA zero”

This inconsistency on recommendation is troubling. I, myself, have fallen into the “anything above QA=0” camp and made that recommendation many times. However, even so, the QA=1 designation was put there for a reason. It would be helpful for this paper to describe the quality flags and describe exactly what are the criteria that would create a QA=1 and then ask why so many low QA in Arctic oceans. One of the criteria might be solar zenith angle itself or a proxy for it. Therefore, QA flag and solar zenith angle are not independent factors and the analysis presented in the paper should clearly explain the overlap and consequences of the overlap. Figures like Figure 7 might be pre-ordained if these parameters are not independent, for example. I noticed that the other reviewer had similar questions about the QA flags.

It's the quality, not the geometry.

The final conclusion is stated on L512-L513: “However, where sufficient coverage with high-quality L2 MODIS AODs is available, such retrievals may provide useful information even under very high (>70°) SZAs.”

The way I read this paper is that solar zenith angle is NOT the reason for the decoupling of passive and active sensor AOD seasonal cycles. It is the QA flags of the passive sensors. Am I wrong? This reinforces the need for a more complete description of what triggers a QA flag to go from “good” to “marginal”. Also the authors might want to think about the title again. It’s not really about solar zenith angle in the end, although it made sense to explore the possibility initially.

Validation against ground truth

There is none. CALIOP is being used as ground truth but is not. CALIOP must make a leap from measurements of backscattering profiles to integrated extinction using assumptions of lidar ratios based on aerosol typing. L141 states that “Globally, CALIOP AODs have been validated against AERONET…” I was curious about the subset of validation at Arctic sites for both CALIOP and the passive sensors. Is there a quick way of looking at that from literature? I don’t expect such a validation in this paper.

Some minor quibbles

- L296-L297: POLDER is also multi-angle.
- L301: What is meant by “brightest months of the year”?
- Authors list: Does co-author Levy really want to be listed as Rob Levy and not Robert C. Levy? It makes future searches more difficult.

Citation: https://doi.org/10.5194/egusphere-2024-3596-RC2
- AC1: 'Comment on egusphere-2024-3596', Sarah Smith, 23 Jul 2025
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 822 | 106 | 24 | 952 | 48 | 22 | 41 |