the Creative Commons Attribution 4.0 License.
Identifying Lightning Processes in ERA5 Soundings with Deep Learning
Abstract. Atmospheric environments favorable for lightning and convection are commonly represented by proxies or parameterizations based on expert knowledge, such as CAPE, wind shear, charge separation, or combinations thereof. Recent developments in machine learning, high-resolution reanalyses, and accurate lightning observations open possibilities for identifying tailored proxies without prior expert knowledge.
To identify vertical profiles favorable for lightning, a deep neural network links ERA5 vertical profiles of cloud physics, mass field variables, and wind to lightning location data from the Austrian Lightning Detection & Information System (ALDIS). The lightning data have been transformed into a binary target variable that labels each ERA5 cell as a cell with or without lightning activity. The ERA5 parameters are taken on model levels extending beyond the tropopause, forming an input layer of approx. 670 features. Data from 2010–2018 serve for training and validation.
On independent test data from 2019, the deep network outperforms a reference with features based on meteorological expertise. SHAP values highlight the atmospheric processes learned by the network, which identifies cloud ice and snow content in the upper and mid-troposphere as very relevant features. As these patterns correspond to the separation of charge in thunderstorm clouds, the deep learning model can serve as a physically meaningful description of lightning.
Depending on the region, the neural network also exploits the vertical wind or mass profiles to correctly classify cells with lightning activity.
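The binary target described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the grid edges, flash coordinates, and the function name label_cells are hypothetical, and the temporal aggregation (one call per time step) is an assumption. The sketch only shows the general idea of turning flash locations into a 0/1 label per grid cell.

```python
import numpy as np


def label_cells(flash_lon, flash_lat, lon_edges, lat_edges):
    """Return a 2-D array (lat x lon) with 1 where a cell saw at least one flash.

    Hypothetical helper: one call handles the flashes of a single time step.
    """
    # Count flashes per cell; rows follow latitude, columns follow longitude.
    counts, _, _ = np.histogram2d(
        flash_lat, flash_lon, bins=[lat_edges, lon_edges]
    )
    # Collapse the counts to a binary target: lightning yes/no.
    return (counts > 0).astype(np.int8)


# Hypothetical 0.25-degree grid over a 1 x 1 degree box (4 x 4 cells).
lon_edges = np.linspace(9.0, 10.0, 5)
lat_edges = np.linspace(46.0, 47.0, 5)

# Three made-up flashes; the first two fall into the same cell.
flash_lon = np.array([9.10, 9.12, 9.80])
flash_lat = np.array([46.10, 46.20, 46.90])

labels = label_cells(flash_lon, flash_lat, lon_edges, lat_edges)
print(labels)
```

Because the counts are collapsed to 0/1, a cell with many flashes and a cell with a single flash receive the same label, which matches the description of a binary target for classification.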
Status: final response (author comments only)
CEC1: 'No compliance with the policy of the journal', Juan Antonio Añel, 09 Jul 2024
Dear authors,
After checking your manuscript, we have found several issues regarding compliance with our journal's Code and Data Availability policy. You have done good work regarding sharing the code. The only problem here is that, in the corresponding section of the manuscript, you link a GitHub repository. GitHub is not suitable for long-term archival of assets for scientific research. GitHub itself states this. You have a Zenodo repository (http://dx.doi.org/10.5281/zenodo.10899180). You must cite this Zenodo repository instead of GitHub in the Code Availability section. Please note this for any revised versions of your manuscript.
We are more concerned that you have not shared the ALDIS data or the merged dataset you use. We must clarify whether you qualify for an exception to our policy regarding sharing the data. We would expect you to have shared the merged dataset that you mention in the manuscript; as it is not the original ALDIS data, I understand that this should be possible. In any case, we need some additional evidence or justification as to why the ALDIS data cannot be shared: laws, regulations, a license, etc.
Finally, a minor issue: note that in your instructions for using the model, you mention the dependency on OpenJDK 8, and for installation, you provide a command that works only on Debian-based operating systems. Although obvious, you may want to modify it by simply mentioning the dependency.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2024-1718-CEC1
AC1: 'Reply on CEC1', Gregor Ehrensperger, 15 Jul 2024
Dear Dr. Añel,
Thanks for your helpful feedback.
- Point 1: This has been addressed and will be updated in the revised version of the manuscript.
- Point 3: This also has been addressed and is covered in the README.md of the accompanying GitHub repository.
- Point 2: We refer to the ALDIS product site [1]. Prof. Georg Mayr (Co-Author) has access to the data through a research cooperation with Dr. Wolfgang Schulz (Head of Department, OVE Service GmbH - ALDIS) [2].
Best regards,
Gregor Ehrensperger
[1] https://www.aldis.at/en/products/aldisexpert/
[2] https://www.aldis.at/en/contact/
Citation: https://doi.org/10.5194/egusphere-2024-1718-AC1
CEC2: 'Reply on AC1', Juan Antonio Añel, 16 Jul 2024
Dear authors,
Regarding the ALDIS data, you need more than just referring readers to a third-party webpage. You use this dataset as an integral part of your work, and therefore, we need it to be published. Moreover, you state that you actually use a new merged dataset. At a minimum, you have to publish the new merged dataset (which I understand you own) in an acceptable public repository according to our policy.
It is imperative that you take the initiative to coordinate with the owners of the ALDIS dataset for its publication in a new repository. This is particularly important as you intend to use it for the publication of your submitted manuscript. It is not acceptable to point readers wishing to replicate your work to an unreliable webpage that does not host the data and where they could be denied access. The data must be public and in trusted repositories precisely to avoid these issues.
Also, given that you do not use the ALDIS dataset directly in your work, it would be good if you discussed with the distributors of the ALDIS dataset its publication in a repository acceptable for scientific publication.
Therefore, it is a condition for the acceptance of your manuscript for publication in GMD that you publish the new merged dataset that you use in your work. I must emphasize that failure to do so will result in the rejection of your manuscript.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2024-1718-CEC2
AC2: 'Reply on CEC2', Gregor Ehrensperger, 17 Jul 2024
Dear Dr. Añel,
Thank you for emphasizing the importance of making this part of the data available.
Our primary contact for the ALDIS dataset is currently on vacation this week.
However, we are actively working to find a solution and will coordinate with him once he is back in the office.
We will keep you updated on our progress and aim to resolve this as swiftly as possible.
Best regards,
Gregor Ehrensperger
Citation: https://doi.org/10.5194/egusphere-2024-1718-AC2
AC3: '2nd Reply on CEC2', Gregor Ehrensperger, 06 Aug 2024
Dear Dr. Añel,
Thanks to the permission of Dr. Wolfgang Schulz (OVE Service GmbH - ALDIS) we have published the aggregated lightning data from the ALDIS network used in our manuscript. Now both data sets used - ERA5 and ALDIS - are publicly available. Thanks for insisting on making our research easier to reproduce!
- The transformed ALDIS data is available at [1].
- The READMEs included with the source code [2] accompanying our paper have been updated accordingly.
- The revised version of the paper will refer to [1] for obtaining the transformed ALDIS data set.
Best regards,
Gregor Ehrensperger
[1] https://doi.org/10.5281/zenodo.13164463
[2] https://github.com/noxthot/xai_lightningprocesses
Citation: https://doi.org/10.5194/egusphere-2024-1718-AC3
RC1: 'Comment on egusphere-2024-1718', Anonymous Referee #1, 01 Aug 2024
- AC4: 'Reply on RC1', Gregor Ehrensperger, 11 Sep 2024
AC6: 'Reply on RC1', Gregor Ehrensperger, 09 Oct 2024
Please note that a (slightly) updated response including line numbers and a diff between the preprint and the revised version is available as part of the author's response.
Citation: https://doi.org/10.5194/egusphere-2024-1718-AC6
RC2: 'Comment on egusphere-2024-1718', Anonymous Referee #2, 11 Aug 2024
The authors present a study which uses machine learning, lightning observations and reanalysis data to find which atmospheric variables are most likely linked to the occurrence of lightning. The idea presented in the paper is good; however, the overall presentation, structure and language make it difficult to follow the authors' thoughts and scientific results. I would also encourage the authors to be clearer about the possible applications they envisage for this work. Is it to help in the formulation of new lightning parameterisation schemes for numerical weather prediction models? Or is this a system that is meant to be used with data from soundings and/or numerical models to make predictions of lightning?
Citation: https://doi.org/10.5194/egusphere-2024-1718-RC2
AC5: 'Reply on RC2', Gregor Ehrensperger, 11 Sep 2024
We appreciate the concise feedback, which aligns well with the more detailed feedback by Reviewer 1.
As you stated in the first sentence of your feedback, the study aims to identify which atmospheric patterns/variables are most likely associated with the occurrence of lightning by using explainable artificial intelligence to uncover the inner workings of a high-performance machine learning model.
Possible applications include:
- Applying the methodology to regions where studies are scarce can accelerate scientific discovery in these areas and improve understanding of atmospheric processes.
- Existing models for lightning often require different parameterizations for ocean and land. The presented methodology might be used for gaining a more holistic understanding of the underlying atmospheric processes.
- The methodology itself is agnostic to lightning and can also be applied to other weather phenomena.
In the revised version we have comprehensively updated the abstract, Section 1 (Introduction), and Section 5 (Discussion and Conclusions) to be clearer about the study's goals. Additionally, we have heavily restructured and reworked Section 3 (Methods) and Section 4 (Results) to improve the overall presentation.
Citation: https://doi.org/10.5194/egusphere-2024-1718-AC5
AC7: 'Reply on RC2', Gregor Ehrensperger, 09 Oct 2024
Please note that a (slightly) updated response including line numbers and a diff between the preprint and the revised version is available as part of the author's response.
Citation: https://doi.org/10.5194/egusphere-2024-1718-AC7
Data sets
The ERA5 Global Reanalysis Hans Hersbach et al. http://dx.doi.org/10.1002/qj.3803
Model code and software
xai_lightningprocesses Gregor Ehrensperger et al. http://dx.doi.org/10.5281/zenodo.10899180
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
- HTML: 263
- PDF: 0
- XML: 0
- Total: 263
- BibTeX: 0
- EndNote: 0