Intercomparison of MAX-DOAS, FTIR and direct sun HCHO vertical columns at Xianghe, China
Abstract. MAX-DOAS (Multi-AXis Differential Optical Absorption Spectroscopy), direct sun DOAS (DS) and FTIR (Fourier Transform InfraRed) measurements are considered nowadays as reference data for the validation of HCHO satellite observations. Recognizing their strengths and limitations, as well as evaluating their consistency, is crucial for generating robust and reliable validation datasets. So far, only a handful of studies have explored the complementarity between MAX-DOAS and direct sun FTIR HCHO measurements. Here we take advantage of the presence of a MAX-DOAS spectrometer, incorporating a direct sun viewing mode capability, and an FTIR instrument operating in parallel at the Xianghe site (39.75° N, 116.96° E, China), to compare the retrieved HCHO vertical columns and investigate in detail the reasons for the observed differences. First, we compare the UV and IR HCHO vertical columns in the direct sun geometry, for which the uncertainty due to the light path is negligible. We find an excellent agreement between the measurements obtained in both wavelength ranges, with a median difference of less than -0.5 × 10^15 molec/cm^2 (-6 %). Second, the MAX-DOAS data are compared to the DS and FTIR ones. The study addresses the impact of using different MAX-DOAS retrieval methods all implemented within the FRM4DOAS centralized processing facility, and discusses differences related to the vertical sensitivity of the measurements in each geometry.
The MAX-DOAS HCHO columns correlate well with the direct sun DOAS and FTIR data, but have a tendency to underestimate them by -22 % and -20.8 % respectively. This bias can be reduced to 1 % when taking properly into account the different a-priori profiles and the respective vertical sensitivities of the MAX-DOAS and FTIR measurements. If the comparison is restricted to the 0–4 km altitude range where MAX-DOAS measurements have their best sensitivity, differences are further reduced with a bias of about -4.6 % for the original comparison and of 2.5 % after taking into account the respective vertical sensitivities and a priori profiles. The underestimation in the retrieved MAX-DOAS vertical columns (VCDs) is shown to be due to the choice of the a priori profile, which neglects the free-tropospheric contribution (above 4 km), where the MAX-DOAS has no sensitivity. These results suggest that improvements to the current FRM4DOAS MAX-DOAS HCHO retrievals are possible.
We investigate whether the underestimation of the MAX-DOAS tropospheric VCDs can be reduced by using more appropriate a priori profiles, based on the CAMS and TM5 chemical-transport models (CTMs). We illustrate the bias reduction with respect to our reference direct sun data. The impact is different depending on the season. The use of CTM-based a priori profiles has a positive impact on the retrieved VCDs in all seasons except winter. When restricting the comparison to the 0–4 km altitude range, the impact of the a priori profile is only significant in the winter period, also leading to a degradation of the agreement with FTIR data. The improvement of the agreement between MAX-DOAS and FTIR data is thus mainly related to a better handling of the free-tropospheric part of the profile, smaller in winter than in other seasons.
Competing interests: At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques.
Summary of the manuscript
This study presents an intercomparison of formaldehyde (HCHO) vertical columns retrieved from FTIR, MAX-DOAS, and direct sun (DS) DOAS measurements at the Xianghe station in China. The MAX-DOAS and DS measurements are made with the same instrument. The work aims to assess the consistency of these ground-based techniques and, to some extent, to evaluate the performance of the current FRM4DOAS MAX-DOAS retrieval system. The main goals of the study are to (1) assess the quality of the MAX-DOAS HCHO products currently delivered by the FRM4DOAS system and (2) revisit the HCHO retrieval approach used in the system to further improve its accuracy.
The authors first compare UV and IR DS HCHO VCDs, finding good agreement. Comparisons between MAX-DOAS and DS/FTIR show that MAX-DOAS tends to underestimate HCHO by about 20%. They report that the bias is substantially reduced when differences in vertical sensitivity and a priori profiles are accounted for, and that restricting the comparison to the 0-4 km range, where MAX-DOAS is most sensitive, also improves agreement. The study also investigates the impact of using chemistry transport model (CAMS and TM5) profiles as a priori in MAX-DOAS retrievals. Results show that this approach improves agreement in most seasons by better representing free-tropospheric HCHO contributions, although wintertime comparisons degrade slightly. Overall, the work demonstrates that MAX-DOAS HCHO underestimations primarily arise from limited sensitivity above 4 km and the choice of a priori profiles. The findings highlight the importance of harmonizing retrieval strategies across ground-based networks.
Major Comments
The manuscript provides valuable intercomparisons, especially because they incorporate the UV and IR direct sun observations. However, the analysis is limited to a single site (Xianghe). The authors should discuss the representativeness of suburban/urban conditions and how the conclusions may differ in other regions, e.g., remote regions with low HCHO levels or highly polluted regions. I do not recommend expanding the analysis to more sites, but the authors should mention, with references, how additional collocated sites (NDACC/FRM4DOAS) would strengthen the broader applicability of the findings.
Also, the authors should more explicitly highlight how this study advances harmonization and validation efforts. For example, my understanding is that MMF does not apply a correction factor for O4 while MAPA does; is that correct? Have you examined the effect of using the same aerosol extinction profiles in both retrievals? At present it is hard to tell whether aerosols cause some of the discrepancies. What would the authors recommend for FRM4DOAS, given that the MMF and MAPA results differ in many cases, as mentioned in the manuscript? An overall conclusion, with suggestions and guidance for improving the FRM4DOAS MAX-DOAS products, is missing.
A central result is that the MAX-DOAS underestimation arises from the choice of a priori profile, but the discussion of using CAMS/TM5-based priors remains somewhat unclear. The authors should clarify: how much of the improvement reflects a real reduction in bias versus an artificial correction; whether reliance on CTM priors risks transferring model biases; and the extent to which these results can be generalized to other sites. Furthermore, both models used here have a coarse spatial resolution (> 80 km), which will not capture heterogeneity from local sources. Including a high-resolution model would benefit the paper, particularly since a single site is used as a case study; otherwise, a thorough discussion of this limitation is warranted.
The degradation of agreement in winter when CTM priors are applied is mentioned but not fully explained. The authors should provide a stronger discussion of possible causes (e.g., reduced MAX-DOAS sensitivity, low HCHO levels, model representation issues) rather than treating this as a minor limitation.
When harmonization (Rodgers & Connor approach) is applied, mean biases improve but regression slopes/intercepts sometimes degrade. This needs clearer explanation to avoid confusion about whether retrievals are actually improved and whether the smoothing process is actually needed, especially given the complexity of all the transformation variables.
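For reference, the column smoothing discussed in this comment can be sketched as below. This is only a minimal illustration of the Rodgers and Connor form, c_smoothed = c_a + sum_i a_i (x_i - x_a,i), written in terms of partial columns per layer. The function name, the four-layer grid, and all numerical values are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def smooth_column(partial_cols, apriori_partial_cols, column_ak):
    """Column that an instrument with total column averaging kernel
    `column_ak` and a priori `apriori_partial_cols` would report if the
    true profile were `partial_cols` (all arrays on the same layer grid)."""
    x = np.asarray(partial_cols, dtype=float)
    xa = np.asarray(apriori_partial_cols, dtype=float)
    a = np.asarray(column_ak, dtype=float)
    return xa.sum() + np.dot(a, x - xa)

# Hypothetical partial columns (1e15 molec/cm^2), surface layer first.
maxdoas_profile = np.array([6.0, 2.0, 0.5, 0.0])  # no free-tropospheric HCHO
ftir_apriori = np.array([4.0, 2.0, 1.0, 1.0])
ftir_ak = np.array([0.8, 1.0, 1.5, 2.0])          # hypothetical AK shape

unsmoothed = maxdoas_profile.sum()
smoothed = smooth_column(maxdoas_profile, ftir_apriori, ftir_ak)
print(unsmoothed, smoothed)
```

With an averaging kernel of 1 in every layer the function returns the plain sum of the input profile, which makes the role of the kernel easy to check. A short statement of this kind in the manuscript, listing which profile, a priori and kernel enter each term, would make the transformation easier to follow.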
Several figures (especially in the Appendix) are dense and difficult to interpret. Simplification is warranted. The manuscript is already quite long, and the large number of complex figures makes it challenging to follow. I recommend retaining only those figures that clearly support the main findings and improve readability.
Since one of the main motivations is satellite validation (e.g., TROPOMI, GEMS, TEMPO), the manuscript should discuss more concretely how improved MAX-DOAS retrievals will influence satellite bias assessments and network harmonization. This would enhance the relevance and impact of the study.
The comparison between FTIR and direct sun DOAS is limited to a few months within a single year and includes a gap. I recommend stating in the conclusions that longer comparisons are still needed to test long-term stability.
Specific comments
P1, L10: The median difference is reported as a negative number, but without further explanation this sign is not meaningful. I recommend adding a short explanation of the sign convention and including the uncertainty/variability for all quantitative results.
P1, L12: For readers unfamiliar with FRM4DOAS, the reference is unclear. Please define FRM4DOAS or use a more general description.
P1, L15: When discussing the underestimation (-22% and -20.8%), please include the associated variability. Are these values significantly different, or can they be summarized as a ~20% underestimation?
Abstract: The distinction between the two bias reductions (to 1% after accounting for a priori/AKs vs. 2.5% after restricting to 0–4 km) is not clear. These paragraphs appear redundant or inconsistent. Please clarify the difference in approach and interpretation.
Abstract / Introduction: It is stated that MAX-DOAS has no sensitivity above 4 km, yet the underestimation is attributed to the a priori. This is contradictory. If improved agreement comes only from using better a priori profiles, then the improvement is not due to additional information from MAX-DOAS but rather the imposed prior. Please clarify.
P2, L26: Define what is meant by “positive impact” when describing the use of CTM-based priors.
Text (paraphrased, P2):
“When restricting the comparison to the 0–4 km altitude range, the impact of the a priori profile is only significant in the winter period, also leading to a degradation of the agreement with FTIR data. The improvement of the agreement between MAX-DOAS and FTIR data is thus mainly related to a better handling of the free-tropospheric part of the profile, smaller in winter than in other seasons.”
This sentence is unclear. Does it mean that the comparison worsens in winter and improves during the rest of the year? Please rephrase to make the seasonal dependence explicit.
P4, L102: What do you mean by “state-of-the-art”? I recommend removing it.
P4, L106: Include the altitude of the Xianghe station.
P5, L132: Please explain why the UV channel stopped operating in 2018. Was there a technical failure or another reason? Overall, the sentence “The Xianghe MAX-DOAS measurements cover the period…. for the UV channel. The VIS channel continued to work until August 2022” is unclear, because it was stated earlier that “It is a dual channel system composed of two grating spectrometers covering the UV and visible”. What happened after 2022? Is the instrument currently working, and will it be used for future satellite validation?
P5, L147: Spell out acronyms MMF and MAPA at first mention.
P8 (a priori column): Please justify why 8.4 × 10^15 molec/cm^2 is used as the a priori vertical column value.
P8, L205: Correct “1rst” to “1st.”
Figure 2: How is the direct-sun averaging kernel in panel (b) estimated? Is it the total column AK? If so, please state this explicitly, and consider also including the FTIR total column AK for comparison.
Section 2.3: If the FTIR instrument is mainly operated for TCCON-like observations, what is the effective time resolution of HCHO retrievals?
Section 2.4: Given the coarse spatial resolution of the models, please describe in more detail how a priori profiles are extracted at the Xianghe site. A clearer explanation of the interpolation method would be helpful.
Section 3.1.1: It appears averaging kernels are not applied in the FTIR vs MAX-DOAS DS comparison. Please confirm if this is correct, and if so, justify why it is not necessary to account for the AK differences.
Section 3.1.1 (reference spectrum): Please describe when and how the MAX-DOAS DS reference spectrum was derived. How sensitive are the results to the season/time of year chosen for the reference?
Figure 3: Was any filtering applied to the quantitative correlation analysis? Please clarify.
P7, L378: The direct-sun DOAS (DS) is chosen as the main comparison reference for MAX-DOAS in Sect. 3.1.2, although FTIR offers a longer overlap period. Please elaborate on the reasoning for prioritizing DS over FTIR in this section.
P12, L271: It is mentioned that a fixed reference spectrum is used for the entire DS period, but also that using a season-specific reference spectrum reduces the fit residual while increasing the uncertainty. This seems counterintuitive; could you explain why this occurs?
P13, L311: It is mentioned that the FTIR AK peaks around 10 km and is about 0.8 at the surface. What would a value of 2 around 10 km mean?
P14, L335: Improve caption of Figure A4. What is the difference between each subpanel using the same model?
P17, L374: I suggest removing Lines 374-375, as they do not contribute to the findings and the paper is already long.
P17, L378: It is interesting that the authors chose UV DS for the comparison with MAX-DOAS, given that the FTIR covers a longer time span. Please clarify why this was chosen.
P17, L403: What reasoning or analysis supports the statement that the Pandora direct sun product overestimates HCHO? How was this conclusion derived? It is mentioned that the “difference found in the latter study is larger than what a free tropospheric HCHO could explain”, but no explanation is given.
P19, L418: Please clarify why partial columns up to 4 km were specifically chosen.
P19, L431: Define H75 when first introduced.
Section 3.2. Profile comparisons. I recommend revising this section thoroughly for clarity. At the moment, it is very difficult to follow, especially with the large number of figures in the appendix. It would be better to streamline the text and focus on the most important points.
Figure 5: The figure shows profiles from MAX-DOAS (MMF and MAPA) up to 4 km. Are the MMF and MAPA retrievals carried out only up to 4 km? Typically, retrievals are performed on more layers, and where the observations provide no information, the a priori information is used.
P23, L508: It appears that the median bias decreases with smoothing; however, as noted, the overall correlation worsens. I recommend including the uncertainty or error (e.g., standard deviation) associated with the bias, since the variability also seems to increase with smoothing. In addition, Figure 9 indicates that smoothing has a negative effect in winter. Based on these points, please discuss whether smoothing is still necessary or justified for future studies.
Figure 7: How was the DS averaging kernel derived? Please describe the method in the text.
Figure 8. What is the difference between bias and (med) in the text for each subplot?
Table 4 caption: Clarify what “original” vs. “smoothed” means. Does “original” mean without smoothing, and “smoothed” mean extended to 100 km and then convolved with FTIR AKs? Please make it explicit.
P9, L509 / Figure 8: It is stated that smoothing improves the bias, but this is not convincingly shown in Fig. 8b. While the median bias appears to improve, the regression slope degrades, and at high columns MAX-DOAS underestimation is worse while at low columns an overestimation appears. This suggests the improved bias is an artifact of offsetting errors rather than a genuine improvement. The seasonal plot in Fig. 9 (DJF) makes this issue more evident. Please provide an explanation and discuss whether smoothing is still necessary or appropriate for future intercomparisons.
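The concern about offsetting errors can be illustrated with a small synthetic example. The sketch below uses purely hypothetical numbers (not data from the manuscript) to show that a dataset with a regression slope well below 1 and a positive intercept can still have a median bias of zero, because the overestimation at low columns cancels the underestimation at high columns.

```python
import numpy as np

# Hypothetical reference columns (1e15 molec/cm^2), evenly spread.
reference = np.linspace(2.0, 20.0, 10)

# Hypothetical "smoothed" columns: slope 0.7, intercept 3.3.
slope_true, intercept_true = 0.7, 3.3
smoothed = slope_true * reference + intercept_true

diff = smoothed - reference
median_bias = np.median(diff)                       # close to zero
slope_fit, intercept_fit = np.polyfit(reference, smoothed, 1)

print("median bias:", median_bias)
print("fitted slope, intercept:", slope_fit, intercept_fit)
print("bias at lowest column:", diff[0])            # positive
print("bias at highest column:", diff[-1])          # negative
```

Reporting the bias separately for low and high column ranges, alongside the median, would help distinguish a genuine improvement from this kind of cancellation.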
Section 3.1.1 (sensitivity): Please clarify the importance (or lack thereof) of differences in sensitivity between direct sun IR and UV retrievals. This may be relevant to interpreting the results.
P30, L650: Please clarify what the reported “–6%” refers to. As written, it is not clear whether FTIR or DS DOAS values are lower. Specify explicitly which dataset underestimates the other.
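To make the requested sign convention concrete, a minimal sketch is given below. It assumes the relative difference is defined as 100 × (test − reference) / reference, so that a negative median means the test dataset is lower than the reference. The function name and the column values are hypothetical and do not come from the manuscript; the interquartile range is used here only as one possible measure of variability.

```python
import numpy as np

def relative_difference_stats(test, reference):
    """Median and interquartile range of 100 * (test - reference) / reference.
    A negative median means `test` is lower than `reference`."""
    test = np.asarray(test, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rel = 100.0 * (test - reference) / reference
    q25, q75 = np.percentile(rel, [25, 75])
    return {
        "median_pct": float(np.median(rel)),
        "q25_pct": float(q25),
        "q75_pct": float(q75),
        "n": int(rel.size),
    }

# Hypothetical collocated columns (1e15 molec/cm^2).
dataset_a = [9.4, 7.6, 11.4, 14.1]
dataset_b = [10.0, 8.0, 12.0, 15.0]

a_minus_b = relative_difference_stats(dataset_a, dataset_b)  # negative median
b_minus_a = relative_difference_stats(dataset_b, dataset_a)  # positive median
print(a_minus_b)
print(b_minus_a)
```

Stating the definition once in the text (which dataset is the test and which is the reference) and quoting the spread next to each median would remove the ambiguity.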