Assimilation of ground based lidar and ceilometer observations of aerosols from the European E-Profile network into ECMWF's Integrated Forecasting System (IFS-COMPO, CY49R1)
Abstract. The Integrated Forecasting System with its extension for atmospheric composition (IFS-COMPO) provides global forecasts of atmospheric trace gases and aerosols for the Copernicus Atmosphere Monitoring Service (CAMS). The present system constrains aerosol concentrations by assimilating aerosol optical depth (AOD) from different satellites. Here, we explore the possibility of assimilating, in addition, ground-based lidar and ceilometer observations from the European E-Profile network. The system performance is evaluated by comparison to non-assimilated E-Profile stations, AOD observations from Aeronet, and aerosol surface concentrations from AirBase. Assimilation of E-Profile data significantly reduces biases and root mean square errors (RMSE) of model-equivalent vertical profiles of the attenuated backscatter coefficient. Without assimilation of E-Profile, surface concentrations of particles smaller than 2.5 μm (PM2.5) are frequently overestimated during summer, while corresponding concentrations of particles smaller than 10 μm (PM10) tend to be underestimated. Assimilation of E-Profile can reduce the RMSE of PM2.5 by up to 50 % and of PM10 by up to 10 %. Since the present analysis system uses the total aerosol mass mixing ratio as control variable, it cannot simultaneously reduce the positive PM2.5 bias and the negative PM10 bias. It typically reduces the PM2.5 bias at the expense of PM10, since fine particles make the dominant contribution to the optical cross sections per mass. Tests of different assimilation-system configurations reveal that the best overall performance is obtained by treating optical properties of dust with a spheroid model, suppressing vertical correlations in the background error covariances, and applying a relatively aggressive cloud mask.
Competing interests: Co-author Samuel Rémy is a member of the editorial board of Geosci. Mod. Dev.
This paper presents numerical experiments on the assimilation of E-Profile data into CAMS. E-Profile is a well-populated network of ceilometers with what appears to be a robust and well-thought-out data distribution system, meeting the Benedetti et al. (2018) criteria of being "fast, accessible, and well-documented" for use in operational aerosol assimilation. And the good news is that, from a purely statistical point of view, the authors show that the assimilation of E-Profile data is beneficial to the analysis of other variables such as PM2.5. This is in itself important: assimilating one set of parameters (here ceilometer-derived attenuated backscatter) helps the analysis of a fully independent set, in this case PM2.5. If this is the exit criterion, done. The authors and study participants have done their job and have a result that is quite worthy of publishing in GMD.
But (and in data assimilation there is always a "but"), while the authors show a positive impact for region-wide error statistics, it is unclear to the reader what is going on, and in particular where the impact falls relative to significant events. I very much appreciated the appendices, but even there the specifics seem unclear. I can understand any reticence on the authors' part to add more details, as data assimilation investigations easily lead to endless rabbit warrens that can go in unexpected directions (think Watership Down). But I think many things can be made clear through more direct language, and also some additional discussion, in particular of their example cases. I don't expect the authors to perform any major reruns of their analysis; what I suggest should take a little time without being particularly onerous or lengthy.
The problem at hand is that data assimilation is not unlike a half-filled water balloon. You constrain it somewhere, and based on the workings of the model, things can pop out in other places. For CAMS, as the authors point out, the baseline is AOD assimilation. E-Profile data modified this baseline through the instruments' and the model's attenuated backscatter, which is an underdetermined parameter. There are lots of ways to distribute an increment in attenuated backscatter in the model, and yet there is no discussion whatsoever of the 4D-Var adjoint. Based on my previous conversation with Angela Benedetti, the adjoint for aerosol is quite simple (as it should be). Nevertheless, for the case studies, I would strongly recommend adding direct details of what is changing in the model. I am guessing the aerosol speciation fraction is not changing. If so, it would help if it was stated clearly. If it is, how so? Details of how the AOT error changes would also be helpful. There are many AERONET sites in Europe, and I found some nearby for the two test cases shown.
It is pretty trivial to make your model look like your observation; the real question is to what extent you break something else in the process. Returning to the half-filled water balloon analogy: the authors have optimized for pan-European bias and RMSE, but we don't know what is happening at individual sites and during more significant events. We then have a long throw from two incomplete case studies to domain-averaged bias and RMSE. This is equivalent to a forecast contest where you guess the next day's high temperature ahead of a cold front. Get the timing of the front wrong and you have lost. But statistically, if you pick the climatological mean temperature, you have a good shot at winning: not in spirit, mind you, but by the metrics at hand. So I, as a reader, would appreciate a few more statistics, at the very least separating more significant versus background events, event skill scores, and/or a PDF of sites that see improvement versus sites that may be degraded.
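To illustrate the kind of site-level summary I have in mind, here is a minimal sketch using entirely synthetic per-site RMSE values; the station count and error distributions are invented, and the real numbers would come from the paper's AirBase evaluation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Entirely synthetic per-site PM2.5 RMSE [ug m^-3] for a control (AOD-only)
# run and an E-Profile experiment; all values are invented for illustration.
n_sites = 200
rmse_ctrl = rng.gamma(shape=4.0, scale=2.0, size=n_sites)
rmse_expt = rmse_ctrl * rng.normal(loc=0.85, scale=0.15, size=n_sites)

# Relative RMSE change per site: negative means the experiment improved the site.
rel_change = (rmse_expt - rmse_ctrl) / rmse_ctrl
frac_improved = float(np.mean(rel_change < 0.0))
print(f"fraction of sites improved: {frac_improved:.2f}")

# A crude text-mode PDF of the per-site change.
hist, edges = np.histogram(rel_change, bins=np.linspace(-0.6, 0.6, 13))
for count, lo, hi in zip(hist, edges[:-1], edges[1:]):
    print(f"[{lo:+.2f}, {hi:+.2f}): {'#' * int(count)}")
```

A one-line "X % of sites improved" plus such a distribution would tell the reader far more than a domain-mean RMSE alone.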
Some of what is going on is somewhat obfuscated in the appendices. There is a great deal of discussion of "updates to the optical model," but in the context of this work, all they need to say is, "We essentially changed the lidar ratio for the near-infrared from X to Y." Such a change could easily produce a large change/improvement in attenuated backscatter and vertical distribution without changing AOD, but it would help if they showed that. Likewise, adding a VIIRS image and an AOT plot would be greatly beneficial. Maybe even time-height cross-sections for Figures 2 and 3 would make the authors' case more effective.
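To make the lidar-ratio point concrete, here is a minimal sketch (not the IFS-COMPO observation operator, whose details live in the paper's appendices) showing that doubling an assumed lidar ratio S halves the attenuated backscatter of an idealized layer while leaving its AOD untouched; all numbers are illustrative:

```python
import numpy as np

def attenuated_backscatter(z, ext, lidar_ratio):
    """Model-equivalent attenuated backscatter:
    beta_att(z) = (ext(z) / S) * exp(-2 * integral_0^z ext dz')."""
    beta = ext / lidar_ratio                             # backscatter [m^-1 sr^-1]
    layer_tau = 0.5 * (ext[1:] + ext[:-1]) * np.diff(z)  # trapezoidal optical depth
    tau = np.concatenate(([0.0], np.cumsum(layer_tau)))  # cumulative from the surface
    return beta * np.exp(-2.0 * tau)

# Idealized, well-mixed aerosol layer: the AOD is fixed by construction,
# so only the assumed lidar ratio S differentiates the two profiles.
z = np.linspace(0.0, 3000.0, 301)                        # height [m]
ext = np.where(z < 1500.0, 2e-4, 0.0)                    # extinction [m^-1]
aod = np.sum(0.5 * (ext[1:] + ext[:-1]) * np.diff(z))    # ~0.3, same in both cases

for s in (40.0, 80.0):                                   # lidar ratios [sr]
    b = attenuated_backscatter(z, ext, s)
    print(f"S = {s:.0f} sr: AOD = {aod:.3f}, beta_att(0) = {b[0]:.2e} m^-1 sr^-1")
```

This is exactly why a "from X to Y" statement for the lidar ratio, together with a before/after AOD, would close the loop for the reader.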
Anyways, I did enjoy reading the paper, and it made me think. These comments are offered out of admiration for the work. I leave it to the authors to decide how far they want to take this.
Specific comments.
Ln 6: AERONET, not Aeronet.
Figures 2 & 3: The figure text is way too small. Just a suggestion: can you please add the model midpoints as visible dots on the profiles? Also, maybe add below a time series of the baseline ceilometer data? Lastly, please revise before the typesetters ask you to. Also, please add the change in AOD for these cases.
Section 2.1: I don’t mean to extend the paper, but I would appreciate a few more details on the assimilation process up front in Section 2, instead of having to rummage through the supplemental materials. The authors note, on one hand, that the model is at full resolution, but the minimization is at a lower resolution. I would appreciate the authors being a bit more specific on these points in the main body.
My point is that their assimilation experiment uses a high-density ceilometer network, but what is the effective resolution of the data assimilation versus, say, meso-alpha features that may be important in Europe, where terrain variability can be significant? Again, I am not asking for a full analysis, just a little context for the experiment and for what it captures and what it doesn't. For example, the use of wavelets seems to be important, but the main body and the referenced appendices are quite thin on this method.
Section 2.2.1 / Figure 1: Comparing the left and right panels, it looks like most (but not all) of the stations for validation are also collocated with stations for assimilation—or are so close as to make no difference. How are you holding back data then? Temporal removal? A bit more description here would be appreciated.
Section 3.1 / Figures 2 & 3: I very much appreciate case studies; they are where you can learn what works and what doesn't. I recommend significant expansion here, not only in the figures (e.g., speciated AODs would really help), but also by adding a time element. The case you showed for September 6, 2023, appears to be on the backside of a big event, based on my own analysis. That site was hit with a big event on the 4th and 5th, which cleared out on the 6th. But CAMS did not clear it out. Hence, E-Profile made the beneficial correction, but you never really know that unless you look at a bit of a time series. This kind of context helps the reader really understand what is happening four-dimensionally in the model. Oh, and please increase the font size before the typesetter asks you.
Figures 4 and 5: With the colorbar it is sometimes hard to see what is going on. I suggest adding line plots for, say, 3 levels (surface, 1 km, 3 km?). You could also add individual stations representing good, middling, and low performers.
Section 3.2: Maybe a bit more discussion of which regions benefitted the most/least, and why you think that is? You mention that you think some regions have higher-/lower-quality instruments (e.g., line 200), but does any of this also correlate with meteorological or terrain features? If you think it is instrument related, shouldn't one of the core conclusions be what kind of instruments are most appropriate? Also, the discussion of PM2.5 bias (line 210) again leads into a rabbit warren on optical properties. So again, what ultimately is the lidar ratio you are deriving here? Does it make sense?
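For concreteness, the lidar ratio I am asking about can be backed out of any model state as the column-effective value S_eff = AOD / integral(beta dz); a sketch with a made-up backscatter profile and AOD, purely to show the arithmetic:

```python
import numpy as np

# Column-effective lidar ratio implied by a model state:
#   S_eff = AOD / integral(beta dz),
# with beta the (non-attenuated) aerosol backscatter at the same wavelength.
# The profile and AOD below are invented for illustration.
z = np.linspace(0.0, 4000.0, 401)        # height [m]
beta = 4e-6 * np.exp(-z / 1000.0)        # backscatter [m^-1 sr^-1]
aod = 0.22                               # column aerosol optical depth

int_beta = np.sum(0.5 * (beta[1:] + beta[:-1]) * np.diff(z))
s_eff = aod / int_beta
print(f"column-effective lidar ratio: {s_eff:.1f} sr")
```

Reporting this number per species, or even just per case study, would let readers judge it against field observations at a glance.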
Figure A1: Maybe put panels (e)-(h) on a log scale?
Figure A3: Can you also please provide the baseline lidar ratio you are using, for either spherical or nonspherical particles (your preference)? There are factors of 2 floating around, and I am curious where this system sits relative to field observations.
D4: Perhaps a bit more discussion of the impact on AOD? Given the changes in Figure D3, I would imagine the AOD change would be large. Can you give specific numbers for this? The mean-bias plot (D4) would necessarily be of low magnitude because the mean AODs are low. How about significant events? Maybe break the statistics up between low, middle, and high AODs? The wavelet analysis finding is interesting. Care to elaborate?
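The low/middle/high stratification could be as simple as binning the collocated pairs by observed-AOD terciles; a sketch with synthetic AODs (the real pairs would be the model/AERONET collocations, and the distributions below are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic collocated AOD pairs standing in for AERONET vs. the analysis;
# the lognormal/normal parameters are invented for illustration.
aod_obs = rng.lognormal(mean=-2.0, sigma=0.8, size=1000)
aod_mod = aod_obs + rng.normal(loc=0.01, scale=0.05, size=1000)

# Stratify by observed-AOD terciles so a small mean bias over the many
# low-AOD cases does not mask errors during significant events.
terciles = np.quantile(aod_obs, [1.0 / 3.0, 2.0 / 3.0])
bins = np.digitize(aod_obs, terciles)
for k, name in enumerate(("low", "middle", "high")):
    sel = bins == k
    bias = float(np.mean(aod_mod[sel] - aod_obs[sel]))
    print(f"{name:6s} tercile (n={int(sel.sum()):3d}): mean AOD bias = {bias:+.3f}")
```

Even just three numbers per experiment along these lines would answer the "significant events" question directly.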