https://doi.org/10.5194/egusphere-2026-2388
29 May 2026
Status: this preprint is open for discussion and under review for Annales Geophysicae (ANGEO).

High-latitude auroral and cloudiness occurrence from automatic image classification

Noora Partamies and Mikko Syrjäsuo

Abstract. We have investigated auroral and cloudiness occurrence over Kjell Henriksen Observatory (KHO) in Svalbard using full-colour all-sky images from 2016-2025. Our approach focused on constructing a high-quality manually labelled training set. Images were classified as ClearAurora, ClearNoAurora, CloudyAurora, or CloudyNoAurora based on their content. As there is natural overlap between these classes, we carried out several iterative validation rounds to increase the number of high-quality sample images while removing images with unclear contents. We then evaluated different Convolutional Neural Network topologies and selected the best performing network to classify all images between January 2016 and December 2025 (over 8 million images in total). In addition to the validation accuracy with the ground truth, we also estimated the classification accuracy based on a random selection of classified images. Our final classifier, called KHOnet2026, results in accuracies from 94% to 98% depending on the score and image class.

We investigated auroral occurrence over Kjell Henriksen Observatory in Svalbard in 2016-2025 with data based on automatic classification of full-colour all-sky images (8.2 million images in total). We used a simple-to-use classification algorithm with several rounds of manual labelling of randomly selected individual images in four classes: ClearAurora, ClearNoAurora, CloudyAurora, and CloudyNoAurora. In each iteration, images that did not clearly belong to any of the four classes were removed to minimise confusion. We acknowledge that our classes naturally overlap, and that this overlap determines the highest achievable accuracy of our method, in this case 96%.
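As an illustration of how the four class labels can be turned into occurrence statistics, the following is a minimal Python sketch. It is not the authors' code: the function name `occurrence_fractions`, the output keys, and the toy label list are hypothetical, and it assumes that auroral occurrence counts both ClearAurora and CloudyAurora images.

```python
from collections import Counter

# The four class names are taken from the text above.
CLASSES = ("ClearAurora", "ClearNoAurora", "CloudyAurora", "CloudyNoAurora")


def occurrence_fractions(labels):
    """Summarise per-image class labels as occurrence fractions.

    Hypothetical helper: assumes every label is one of CLASSES and that
    "aurora" means ClearAurora plus CloudyAurora.
    """
    counts = Counter(labels)
    unknown = set(counts) - set(CLASSES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labelled images")
    frac = {name: counts[name] / total for name in CLASSES}
    return {
        "cloudy": frac["CloudyAurora"] + frac["CloudyNoAurora"],
        "aurora": frac["ClearAurora"] + frac["CloudyAurora"],
        "clear_no_aurora": frac["ClearNoAurora"],
    }


# Toy labels for illustration only; these are not values from the dataset.
toy_labels = (
    ["CloudyNoAurora"] * 5
    + ["CloudyAurora"] * 2
    + ["ClearAurora"] * 2
    + ["ClearNoAurora"] * 1
)
summary = occurrence_fractions(toy_labels)
```

Keeping the class-to-category mapping in one small function makes the counting convention explicit, so a different definition of auroral occurrence only requires changing the returned sums.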

We found that most of our image data is cloudy (60-70%). The cloud occurrence results were validated against an independent dataset from a co-located cloud sensor. The two datasets agree well at the monthly average level, with a correlation coefficient of 0.86. Auroral occurrence over Svalbard is of the order of 25% of the imaging time; it shows no solar cycle correlation but is instead modulated by the cloudiness. The portion of clear skies without aurora is only about 10%. The statistically clearest month at KHO is January, and the cloudiest is November. This automatic classification routine is set to run in real time and to further expand the database of classified images to aid researchers in finding images with aurora. Because cloudy data can be excluded in advance, this knowledge allows a far more efficient use of computer time when analysing the structural evolution of the aurora. Furthermore, the automatically classified images provide a very useful proxy for all other optical instruments hosted by KHO.
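The monthly comparison with the cloud sensor can be sketched as follows, assuming a Pearson correlation between monthly mean cloud fractions; the text does not specify which correlation coefficient was used. The function names, the `(month, is_cloudy)` record format, and all numbers below are hypothetical toy values, not results from the paper.

```python
from collections import defaultdict
from math import sqrt


def monthly_cloud_fraction(records):
    """Average a sequence of (month_key, is_cloudy) pairs per month."""
    tallies = defaultdict(lambda: [0, 0])  # month -> [cloudy count, total]
    for month, is_cloudy in records:
        tallies[month][0] += int(is_cloudy)
        tallies[month][1] += 1
    return {month: cloudy / total for month, (cloudy, total) in tallies.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy camera classifications (True = image classified as cloudy).
toy_camera = [
    ("2020-01", True), ("2020-01", False),
    ("2020-02", True), ("2020-02", True),
    ("2020-03", False), ("2020-03", False),
]
# Toy monthly cloud fractions from a hypothetical co-located sensor.
toy_sensor = {"2020-01": 0.6, "2020-02": 0.9, "2020-03": 0.1}

camera_monthly = monthly_cloud_fraction(toy_camera)
# Compare only months present in both datasets, in a fixed order.
months = sorted(set(camera_monthly) & set(toy_sensor))
r = pearson([camera_monthly[m] for m in months], [toy_sensor[m] for m in months])
```

Averaging to monthly values before correlating smooths out short-term differences between the two instruments, such as differing fields of view or sampling cadence.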

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 10 Jul 2026)

Latest update: 30 May 2026
Short summary
We developed a method to sort colour all-sky images into classes of clear or cloudy skies, with and without aurora, using supervised learning and pre-trained convolutional neural networks. We investigate a 10-year database of auroral images taken from Svalbard. The method's accuracy is well over 90%, and the results show that about 2/3 of auroral images are cloudy, with November being the cloudiest month. Aurora are most often observed in the morning hours, independent of solar activity.