the Creative Commons Attribution 4.0 License.
How well can persistent contrails be predicted? – An update
Abstract. The total aviation effective radiative forcing is dominated by non-CO_{2} effects. The largest contributors to the non-CO_{2} effects are contrails and contrail cirrus. There is the possibility of reducing the climate effect of aviation by avoiding flying through ice supersaturated regions (ISSRs), where contrails can last for hours (so-called persistent contrails). Therefore, a precise prediction of the specific location and time of these regions is needed. But a prediction of the frequency and degree of ice supersaturation (ISS) on cruise altitudes is currently very challenging and associated with great uncertainties because of the strong variability of the water vapour field, the low number of humidity measurements at air traffic altitude, and the oversimplified parameterisations of cloud physics in weather models.
Since ISS is more common in some dynamical regimes than in others, the aim of this study is to find variables/proxies that are related to the formation of ISSRs and to use these in a regression method to predict persistent contrails. To find the best-suited proxies for regressions, we use various methods of information theory. These include the log-likelihood ratios, known from Bayes’ theorem, a modified form of the Kullback-Leibler divergence and the mutual information. The variables (the relative humidity with respect to ice RHi_{ERA5}, the temperature T, the vertical velocity ω, the divergence DIV, the relative vorticity ζ, the potential vorticity PV, the normalised geopotential height Z and the local lapse rate γ) come from ERA5, and RHi_{M/I}, which we assume as the truth, comes from MOZAIC/IAGOS (commercial aircraft measurements).
It turns out that RHi_{ERA5} is the most important predictor of ice supersaturation, in spite of its weaknesses, and all other variables do not help much to achieve better results. Without RHi_{ERA5}, a regression to predict ISSRs is not successful. Certain modifications of RHi_{ERA5} before the regression (as suggested in recent papers) do not lead to improvements of ISSR prediction. A sensitivity study with artificially modified RHi_{ERA5} distributions points to the origin of the problems with the regression: the conditional distributions of RHi_{ERA5} (conditioned on ISS and non-ISS, from RHi_{M/I}) overlap too heavily in the range 70–100 %, such that for any case in that range it is not clear whether it belongs to an ISSR or not. Evidently, this renders the prediction of contrail persistence very difficult.
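As an illustration of the three information measures named in the abstract, the following minimal sketch computes them for a binned predictor and a binary ISS label. It uses the standard Kullback-Leibler divergence, not the modified form used in the study, and it runs on synthetic data rather than ERA5 or MOZAIC/IAGOS values. All function names, variable names and distribution parameters are hypothetical.

```python
import numpy as np

def conditional_histograms(x, iss, bins):
    """Normalised histograms of x for ISS and non-ISS cases."""
    p_iss, _ = np.histogram(x[iss], bins=bins)
    p_non, _ = np.histogram(x[~iss], bins=bins)
    return p_iss / p_iss.sum(), p_non / p_non.sum()

def log_likelihood_ratio(p_iss, p_non, eps=1e-12):
    """ln[P(x | ISS) / P(x | non-ISS)] per bin; eps avoids division by zero."""
    return np.log((p_iss + eps) / (p_non + eps))

def kl_divergence(p, q, eps=1e-12):
    """Standard Kullback-Leibler divergence D(p || q) in nats."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps))))

def mutual_information(x, iss, bins):
    """Mutual information (nats) between binned x and the binary ISS label."""
    label_edges = np.array([-0.5, 0.5, 1.5])
    joint, _, _ = np.histogram2d(x, iss.astype(float), bins=[bins, label_edges])
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of the predictor
    py = joint.sum(axis=0, keepdims=True)   # marginal of the label
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

# Synthetic example (assumed distributions, for illustration only).
rng = np.random.default_rng(0)
n = 20_000
iss = rng.random(n) < 0.12
rhi = np.where(iss, rng.normal(95.0, 15.0, n), rng.normal(60.0, 25.0, n))
rhi = np.clip(rhi, 0.0, 160.0)
noise = np.clip(rng.normal(0.0, 1.0, n), -4.0, 4.0)  # unrelated to ISS

rhi_bins = np.linspace(0.0, 160.0, 33)
noise_bins = np.linspace(-4.0, 4.0, 33)

p_iss, p_non = conditional_histograms(rhi, iss, rhi_bins)
llr = log_likelihood_ratio(p_iss, p_non)
kl_rhi = kl_divergence(p_iss, p_non)
mi_rhi = mutual_information(rhi, iss, rhi_bins)
mi_noise = mutual_information(noise, iss, noise_bins)

print(f"KL(ISS || non-ISS) for the humidity-like variable: {kl_rhi:.3f} nats")
print(f"MI humidity-like: {mi_rhi:.4f} nats, MI noise: {mi_noise:.4f} nats")
```

In this toy setting the humidity-like variable carries far more information about the label than the unrelated one, which is the kind of ranking such measures are meant to provide.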

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Interactive discussion
Status: closed

RC1: 'Comment on egusphere2024385', Anonymous Referee #1, 22 Mar 2024
This paper explores various approaches to improve the prediction of ice supersaturated regions that is provided by numerical weather prediction models. This piece of work is crucial in informing ongoing efforts and trials in an attempt to mitigate aviation’s contrail climate forcing. The paper is well-structured, employs suitable methodology, and presents reasonable results. It aligns with the scope of Atmospheric Chemistry and Physics. I recommend it for publication pending consideration of the following methodological questions and suggestions for improvements:
 [Section 2.1] The description of the MOZAIC/IAGOS dataset might create confusion for the reader. For example, MOZAIC was transferred to the European infrastructure IAGOS in 2011, and the authors stated that only data between 2000 and 2009 was used. If this is the case, why not just name the in-situ measurements as "MOZAIC", instead of "MOZAIC/IAGOS"? In addition, what is the rationale for not including more in-situ measurements from the recent IAGOS dataset (i.e., from 2009 to the present day)?
 [Section 2.1] What is the temporal resolution (between flight waypoints) that is provided by the MOZAIC/IAGOS in-situ measurements? Have the authors considered and attempted to minimize the potential for autocorrelation between waypoints?
 [Section 2.2] It is unclear which specific product of the ERA5 reanalysis data was used in this manuscript. Is it the ERA5 high resolution realization (HRES) reanalysis? In addition, what is the spatiotemporal resolution of the ERA5 data that was downloaded and used in this analysis? How does the spatiotemporal resolution of the meteorological data influence the results presented in this paper?
 [Section 2.2] Given that the spatiotemporal resolution of the meteorological data is an important factor that can influence the quality of ISSR and RHi estimates, have the authors considered rerunning the analysis using “model level” data in addition to “pressure levels”?
 [Section 2.2] It is worth including some details on the specific parameterization used to estimate the saturation pressure over ice (p_{ice}), which is required to estimate the RHi. Does the use of different parameterizations, e.g., Sonntag (1994) or Murphy & Koop (2005), lead to differences in the presented results?
 [Section 2.2] A recent open-source repository found that the interpolation method across the vertical level (i.e., linear interpolation, log-log interpolation, or cubic spline interpolation) could lead to differences in the RHi estimates from the ERA5 humidity fields (https://py.contrails.org/notebooks/specifichumidityinterpolation.html). For example, given the nonlinear lapse rate of the specific humidity, a linear interpolation across the vertical level may lead to overestimation of the specific humidity. Therefore, the authors should consider exploring the impact of the interpolation methodology on their presented results.
 [General comment] The authors correctly highlighted that contrail mitigation is most effective when long-lived persistent (warming) contrails are avoided. Since contrails forming near the RHi threshold (close to 100%) tend to be shorter-lived than those formed at higher RHi values, one potential strategy is to focus on regions where ice supersaturation is notably higher than threshold conditions. For instance, could the ETS scores improve if the authors raised the threshold for the “predicted contrail formation” to, say, RHi_{ERA5} > 110% instead of 100%?
 [Structure] Section 3 of the manuscript includes both the methodology and results, thereby making the section unnecessarily long. The authors should consider separating the methodology and results into two different sections to improve the readability of the manuscript.
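The question about the saturation pressure parameterisation can be made concrete with a short sketch. It computes RHi from specific humidity, pressure and temperature using the ice saturation formulas of Murphy & Koop (2005) and Sonntag (1994) as they are commonly quoted. Which formula the manuscript actually uses is not stated here, so the choice of default, the function names and the sample values (220 K, 250 hPa) are assumptions for illustration only.

```python
import math

EPS = 0.622  # approximate ratio of the molar masses of water vapour and dry air

def p_ice_murphy_koop(t_k):
    """Saturation vapour pressure over ice in Pa (Murphy & Koop, 2005), T > 110 K."""
    return math.exp(9.550426 - 5723.265 / t_k
                    + 3.53068 * math.log(t_k) - 0.00728332 * t_k)

def p_ice_sonntag(t_k):
    """Saturation vapour pressure over ice in Pa (Sonntag, 1994).

    The polynomial is written for hPa, hence the factor 100 at the end.
    """
    ln_e_hpa = (-6024.5282 / t_k + 24.7219 + 1.0613868e-2 * t_k
                - 1.3198825e-5 * t_k ** 2 - 0.49382577 * math.log(t_k))
    return 100.0 * math.exp(ln_e_hpa)

def rhi_percent(q, p_pa, t_k, p_ice=p_ice_murphy_koop):
    """Relative humidity over ice (%) from specific humidity q (kg/kg),
    air pressure p_pa (Pa) and temperature t_k (K)."""
    e = q * p_pa / (EPS + (1.0 - EPS) * q)  # water vapour partial pressure
    return 100.0 * e / p_ice(t_k)

# Hypothetical cruise-level sample: 220 K at 250 hPa.
t_sample, p_sample = 220.0, 25_000.0
e_sat = p_ice_murphy_koop(t_sample)
# Specific humidity that is exactly ice-saturated under Murphy & Koop.
q_sat = EPS * e_sat / (p_sample - (1.0 - EPS) * e_sat)

rhi_mk = rhi_percent(q_sat, p_sample, t_sample, p_ice_murphy_koop)
rhi_so = rhi_percent(q_sat, p_sample, t_sample, p_ice_sonntag)
print(f"RHi (Murphy & Koop): {rhi_mk:.2f} %, RHi (Sonntag): {rhi_so:.2f} %")
```

At this sample point the two formulas differ by well under one percent, which suggests (without proving it for the full dataset) that the choice of parameterisation matters much less than the overlap of the conditional RHi distributions described in the abstract.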
Citation: https://doi.org/10.5194/egusphere2024385RC1 
AC2: 'Reply on RC1', Sina Hofer, 15 May 2024
Please find our reply in AC1.
Citation: https://doi.org/10.5194/egusphere2024385AC2

RC2: 'Comment on egusphere2024385', Anonymous Referee #2, 26 Apr 2024
This paper examines the key challenge of accurately predicting ice supersaturated regions (ISSRs) at cruising altitudes using reanalysis data and in situ measurements as ground truth. This paper is a continuation and culmination of some of the authors' efforts and previous research to use dynamical proxies to improve the ability to predict relative humidity with respect to ice. It aims to provide an end-to-end statistical framework for better estimation of ISSR formation, focusing on the different statistical design options and their respective performance. The conclusion, although disappointing, is a significant result for the contrail research community.
The paper is well organised, uses appropriate statistical methods and provides plausible and important results. We recommend its publication as it is, with only a minor technical correction:
[Section 3.1] "This means that the likelihood ratio must be greater than 3.85". I assume here that 3.85=0.5/0.13. But if I'm not mistaken, to make ISS more likely than no ISS, we need the posterior odds ratio to be greater than 1.0, which translates into a likelihood ratio threshold of 1.0/0.13=7.69. Since ln(7.69)=2.0, this would be consistent with the threshold on the logit
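The arithmetic in this comment follows from Bayes' theorem in odds form: posterior odds = likelihood ratio × prior odds. A minimal sketch, assuming that the value 0.13 implied by the referee is the prior odds of ISS against no ISS (variable names are hypothetical):

```python
import math

# Prior odds P(ISS) / P(no ISS), as implied by the referee's 0.5 / 0.13 = 3.85.
prior_odds = 0.13

def posterior_odds(likelihood_ratio, prior=prior_odds):
    """Bayes' theorem in odds form."""
    return likelihood_ratio * prior

# ISS is more likely than no ISS when the posterior odds exceed 1.
lr_threshold = 1.0 / prior_odds
logit_threshold = math.log(lr_threshold)

print(f"likelihood-ratio threshold: {lr_threshold:.2f}")      # about 7.69
print(f"log-likelihood-ratio threshold: {logit_threshold:.2f}")  # about 2.04
print(f"posterior odds at LR = 3.85: {posterior_odds(3.85):.2f}")  # about 0.50
```

A likelihood ratio of 3.85 therefore gives posterior odds of about 0.5, that is, ISS is still only half as likely as no ISS.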
Here are also some points to consider for further scientific discussion or future research, particularly with regard to the statistical methodology (Part 3):
 [Section 2.1] As mentioned by the other reviewer, it would be interesting to understand how the spatiotemporal resolution of the meteorological data affects the statistical results developed in Part 3.
 [Section 3.1] It could be interesting to use the same validation framework (split train/test datasets) to estimate the ETS using the individual Bayesian estimators directly as binary classifiers. Section 3.1 could be developed in a very similar way to 3.2, as methods based on Kullback-Leibler distance can be used as model/variable selectors (although less frequently than mutual information). Distance between variables is not needed if you are using forward additive selection or backward elimination.
 [Section 3.1] It might be interesting to use slightly more complex Bayesian estimators such as naive Bayes, which tends to work surprisingly well even when the variables are correlated, provided that the threshold applied to the posterior output is optimised. Accounting for variable correlation with a simplified DAG and simple conditional probability parameterisations is also an option.
 [Section 3.2.2] The models chosen in this section could be automatically determined by forward additive selection or backward elimination using the mutual information as variable selector.
 [Section 3.2.4] Ridge or lasso regularisation could be used to better control the risk of overfitting due to correlated variables.
 [Section 3.2.4] Although it is highly unlikely that it would have changed the results at all, modern tree-based machine learning techniques (random forests/LightGBM) could be used as they are designed to deal with nonlinearities, overfitting/correlated variable problems and implicit variable selection at the same time.
 [Section 3.2.4] Optimising the output probability threshold might improve the results slightly (but certainly not significantly).
 [Section 4.2.3] As mentioned by the authors, it is not surprising that adjustment techniques like [Teoh et al., 2022] are not really needed here, as the nonlinear nature of the classifier has learned to infer the implicit bias nonparametrically.
 [General] As mentioned by the other reviewer, it might be interesting to reformulate the problem as a multiclass problem, either using situations of even higher supersaturation or using the ontology developed in [Wilhelm et al., 2020] (no persistent contrails, persistent contrails, most warming contrails). A direct regression framework between reanalysis and in-situ relative humidity is also an option.
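Several of the points above (and the 110 % question raised by the other referee) amount to sweeping a decision threshold and scoring each choice with the equitable threat score (ETS). The following minimal sketch shows that procedure with the standard ETS definition. The data are synthetic, with an assumed dry bias and noise in the model humidity, so the numbers say nothing about the study's actual results; all names are hypothetical.

```python
import numpy as np

def equitable_threat_score(forecast, observed):
    """ETS = (hits - hits_random) / (hits + misses + false_alarms - hits_random)."""
    forecast = np.asarray(forecast, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    n = observed.size
    hits = np.sum(forecast & observed)
    misses = np.sum(~forecast & observed)
    false_alarms = np.sum(forecast & ~observed)
    # Hits expected by chance for a forecast with the same event frequency.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    return float((hits - hits_random) / denom) if denom > 0 else float("nan")

# Synthetic "truth" and "model" humidity (assumed distributions).
rng = np.random.default_rng(1)
n = 50_000
rhi_true = np.clip(rng.normal(70.0, 30.0, n), 0.0, 170.0)
rhi_model = np.clip(0.9 * rhi_true + rng.normal(0.0, 10.0, n), 0.0, None)
observed = rhi_true >= 100.0  # ISS according to the synthetic truth

# Score a range of thresholds applied to the model humidity.
thresholds = np.arange(70, 121, 5)
scores = {int(t): equitable_threat_score(rhi_model >= t, observed)
          for t in thresholds}
best_threshold = max(scores, key=scores.get)

for t, s in scores.items():
    print(f"threshold {t:3d} %: ETS = {s:.3f}")
print(f"best threshold in this toy example: {best_threshold} %")
```

To avoid an optimistic score, the threshold would be chosen on a training subset and the ETS reported on held-out data, in line with the split train/test framework mentioned above.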
Citation: https://doi.org/10.5194/egusphere2024385RC2 
AC3: 'Reply on RC2', Sina Hofer, 15 May 2024
Please find our reply in AC1.
Citation: https://doi.org/10.5194/egusphere2024385AC3

AC1: 'Comment on egusphere2024385', Sina Hofer, 15 May 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere2024385/egusphere2024385AC1supplement.pdf
Klaus Martin Gierens
Susanne Rohs