the Creative Commons Attribution 4.0 License.
Leveraging Machine Learning techniques and SEVIRI data to detect volcanic clouds composed of ash, ice, and SO2
Abstract. Volcanic clouds can influence the climate and pose a serious threat to air transportation. Detecting and distinguishing them from meteorological clouds is particularly challenging because they are often composed of water vapor and ice particles, along with ash and gases. This study presents a Neural Network (NN) model for the detection of volcanic clouds composed of ash, ice, and SO2, applied to data acquired by the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) satellite instrument. A dataset of 1,259 SEVIRI images related to Etna volcano eruptions spanning from 2020 to 2022, as well as 2024, was considered. The NN model, based on a multi-layer perceptron (MLP), was developed using 13 features, including thermal infrared channels and brightness temperature differences (BTDs). The model was validated on three eruptive events not used in the training phase, demonstrating an overall high accuracy of 99 %, a precision >89 %, a recall >74 % and excellent capability to detect volcanic clouds, even in complex scenarios of high meteorological cloud cover. The results are promising for automatic and near-real-time detection of volcanic clouds, including those containing ice, and for improving retrieval processes.
Status: open (until 31 Mar 2026)
- RC1: 'Comment on egusphere-2026-727', Anonymous Referee #1, 17 Mar 2026
- RC2: 'Comment on egusphere-2026-727', Andrew Prata, 18 Mar 2026
Please see attached PDF.
Viewed
- HTML: 114
- PDF: 31
- XML: 9
- Total: 154
- Supplement: 30
- BibTeX: 8
- EndNote: 14
Naranjo et al. report a new algorithm for volcanic cloud detection using a spaceborne visible and infrared imager. They conclude that their approach improves on current algorithms for clouds that include a mixture of volcanic ash, ice, and SO2. The weakest performance of the new algorithm occurs at cloud edges, an expected result of the weaker signal there. The methods are good and the results generally supportive, so I am happy to recommend publication with revisions that address the following concerns. Overall, the manuscript was a pleasure to read.
(1) Whether or not the performance metrics in Table 5 are "high" should be considered relative to the performance of previous cloud detection algorithms. The discussion (line 349) includes references to previous algorithms that fail in the presence of ice mixtures. The authors could show quantitative support for their main conclusion by running those algorithms over their validation case studies in order to make a direct comparison.
(2) There are deficiencies, and possibly some errors, in the reporting of algorithm metrics, which I can frame around Figure 8. The authors show metrics for the validation case studies, which have highly imbalanced classes, but not for the test subset of the labelled data, which has been sampled to improve balance. The latter metrics, on the balanced test set, should also be shown. This will allow readers to judge how much value comes from the NN versus the post-processing. Optionally, the authors could include the metrics for each set of hyper-parameters in a supplement. The confusion matrices ought to be rotated (with "Observed" on the horizontal), normalized by the total number of samples (not the marginal totals), and the calculation checked: the counts have been normalized by the total observed in each class, so the resulting true-positive rate (currently bottom right corner) should equal the recall in Table 5. The authors need to say what threshold they used for class assignment and why.
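To illustrate the consistency check requested in (2), here is a minimal sketch in Python with NumPy. The labels are invented for illustration and are not taken from the manuscript, and the helper name `confusion_counts` is hypothetical. It puts "Observed" on the horizontal, shows both normalizations, and confirms that the bottom-right cell of the per-observed-class normalization equals the recall.

```python
import numpy as np

def confusion_counts(observed, predicted):
    """Return a 2x2 count matrix: rows = predicted class, columns = observed class."""
    observed = np.asarray(observed, dtype=int)
    predicted = np.asarray(predicted, dtype=int)
    counts = np.zeros((2, 2), dtype=int)
    # Accumulate one count per sample at (predicted, observed).
    np.add.at(counts, (predicted, observed), 1)
    return counts

# Hypothetical, highly imbalanced labels (1 = volcanic cloud, 0 = background).
observed = np.array([0] * 990 + [1] * 10)
predicted = observed.copy()
predicted[:5] = 1        # 5 false positives
predicted[990:993] = 0   # 3 false negatives

counts = confusion_counts(observed, predicted)

# Normalization by the total number of samples: all four cells sum to 1.
by_total = counts / counts.sum()

# Normalization by the total observed in each class: each column sums to 1.
by_observed = counts / counts.sum(axis=0, keepdims=True)

tp, fn, fp = counts[1, 1], counts[0, 1], counts[1, 0]
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(counts)
print("recall:", recall, "| bottom-right of by_observed:", by_observed[1, 1])
print("precision:", precision)
```

With the column-wise normalization, the bottom-right cell is the true-positive rate by construction, so any mismatch with a separately reported recall points to an inconsistency in how one of the two was computed.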
(3) In section 4.4, the authors struggle (as do I!) to reason about the separate utility of precision and recall. Given the highly unbalanced class composition in the dataset, and the noted application to aviation safety, it may be better to simply focus on recall (the complement of the false-negative rate).
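The link between the class-assignment threshold and these two metrics can be sketched as follows. Everything here is an assumption made for illustration: the scores are synthetic, the 2 % positive fraction is arbitrary, and `precision_recall` is a hypothetical helper, not the authors' code. The point is only that raising the threshold can never increase recall, whereas precision may move either way, which is one reason to state the chosen threshold explicitly.

```python
import numpy as np

def precision_recall(observed, scores, threshold):
    """Precision and recall when pixels with score >= threshold are flagged positive."""
    predicted = scores >= threshold
    tp = np.sum(predicted & (observed == 1))
    fp = np.sum(predicted & (observed == 0))
    fn = np.sum(~predicted & (observed == 1))
    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    recall = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return precision, recall

# Synthetic data: about 2 % positives, with scores shifted upward for positives.
rng = np.random.default_rng(0)
observed = (rng.random(10_000) < 0.02).astype(int)
scores = np.clip(0.7 * observed + rng.normal(0.2, 0.15, observed.size), 0.0, 1.0)

thresholds = [0.3, 0.5, 0.7]
results = [precision_recall(observed, scores, t) for t in thresholds]
for t, (p, r) in zip(thresholds, results):
    print(f"threshold={t:.1f}  precision={p:.3f}  recall={r:.3f}")
```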
(4) Encountering the methods and results of a final analysis in the discussion is surprising. Consider rearranging.
(5) I encourage the authors to publish any software or scripts developed for this manuscript as a code supplement. I also encourage the authors to plan for release of the labelled dataset after a sufficient period of embargo; choose a data archive that makes the labelled dataset citable and only the metadata public and discoverable. At some later date it will then be easy to make the labelled data itself open, which would be a great contribution to future research.