Extraction of spatially confined small-scale waves from high-resolution all-sky airglow images based on machine learning
Abstract. Since June 2019, a scanning airglow camera has been operated every night at DLR Oberpfaffenhofen (48.09° N, 11.28° E), Germany. It provides nearly all-sky images (diameter 500 km) of the OH* airglow layer (height ca. 85–87 km) with an average spatial resolution of ca. 150 m and a temporal resolution of ca. 2 min.
We analyse about three years (941 nights between October 2020 and September 2023) of OH* airglow all-sky images for spatially confined wave structures with horizontal wavelengths of ca. 20 km and less. Such structures are often referred to as ripples and are considered to be instability structures. However, Li et al. (2017) showed that they could also be secondary waves. While ripples move with the background wind, secondary waves do not.
To identify small-scale and spatially confined structures, we adapt and train YOLOv7 (You Only Look Once, version 7), a machine learning approach, to determine their position and extent on the sky as well as their horizontal wavelength. Those wavelengths are compared to two-dimensional FFT (Fast Fourier Transform) results. We analyse the seasonal variations in the propagation direction and horizontal wavelengths of these structures and deduce that instability signatures are observed especially in summer.
Finally, we introduce a concept for “operating-on-demand” in order to derive energy dissipation rates from our measurements.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-4611', Anonymous Referee #1, 13 Nov 2025
- RC2: 'Comment on egusphere-2025-4611', Anonymous Referee #2, 15 Nov 2025
Comments on the manuscript ‘Extraction of spatially confined small-scale waves from high-resolution all-sky airglow images based on machine learning’ by Sabine Wüst et al.
This paper reports high-resolution, wide-area observations of OH airglow images using a scanning camera at DLR Oberpfaffenhofen, and a new method of analysing ripple structures in the images using a machine-learning technique. The authors also show statistical results for the ripples. The new analysis technique extracts two orders of magnitude more events than the past literature, and the results compare well with past observations.
The reviewer would like to congratulate the authors on their successful observations and analyses. The new method, applied to wide-horizontal-range and high-resolution images, is very capable of studying the statistics of ripple structures (small-scale wave-like structures) in the images, for which the relations with instabilities and secondary gravity waves are of great interest.
However, there are some points that need to be improved before the manuscript is published. Thus, I would like to recommend ‘minor revision’.
Main point:
The wording of propagation direction
There are many places where the authors mention ‘propagation direction’. As I read the text, ‘direction’ can mean three different things:
(1) The direction of the so-called phase velocity, i.e. the apparent motion of the phase front lines.
(2) The direction of motion of the area to which the wave-like structure (‘packet’) is moving.
(3) The direction perpendicular to the phase front line (in a single image).
My understanding from the text is that (1) is the one we normally use in the case of gravity waves, and the 2D-FFT shows this via its peak (with a 180° ambiguity in the case of a single image). (2) and (3) can be derived from the ML technique shown here. I suggest that the authors use clearly different words for (2) and the others, to avoid confusing the readers. My suggestion is wording like ‘direction of the wave-structure movement’, ‘direction of wave-packet motion’, ‘direction of wave migration’, or ‘direction of wave drift’; alternatively, Li et al. (2017) use ‘advection’. I would prefer not to use the word ‘propagation’ for (2), because it relates not to wave parameters but to the observed area. I hope such a separation of wording will help readers understand the paper correctly.
Related question.
L 410-412
‘If all the observed wave-like structures are (secondary) gravity waves, then there is no difference between the propagation directions derived from the two different approaches.’
I do not understand what this sentence means. Even in the case of secondary gravity waves, the direction of wave-packet motion may not be the same as the direction of the phase velocity. Please explain further.
Other points
L 40-49
The authors describe airglow imaging observations of breaking gravity waves with citations of a few papers. To my knowledge, the first clear observations showing gravity-wave breaking in airglow, and their analysis, were published by Yamada et al. (GRL, 2001, DOI: 10.1029/2000GL011945) and Fritts et al. (GRL, DOI: 10.1029/2001gl013753). I suggest citing these papers, which were published about 20 years earlier.
L 94-100
The description of the FAIM 4.
It would be useful if the authors could also provide the chip (or pixel) size of the InGaAs camera and the F-number of the lens (or the effective aperture of the lens), so the reader can understand the sensitivity of the optics (e.g. for estimating the ‘A × Ω’ (throughput) value of the camera).
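For reference, a standard small-angle estimate (my own sketch, not taken from the manuscript) relates the per-pixel throughput to exactly these two quantities:

```latex
% Per-pixel etendue (throughput) estimate, small-angle approximation:
% A_pix = pixel area, N = f-number of the lens
G \approx A_\mathrm{pix}\,\Omega ,
\qquad
\Omega = \pi \sin^2\theta \approx \frac{\pi}{4N^2}
```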
L 173-179
It would be helpful if the authors briefly introduced how the FOVs of FAIM 4 and FAIM 3 (13 km × 13 km?) differ.
L 214-215
‘Firstly, performing a 2D-FFT, especially on high-resolution images, is time-consuming and computationally expensive, leading to longer processing times and significantly affecting efficiency in analysing large data sets.’
(A similar expression appears at L 325.)
My feeling is that 2D-FFT is not so time-consuming nowadays, as long as the number of points (most efficiently a power of two, 2^N) is selected properly. I would like to know how difficult it is to use 2D-FFT for the images introduced here. I believe zero-padding to make a square image of size 2^N × 2^N would make the computation time short enough.
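To illustrate the point, a minimal timing sketch (with made-up image dimensions, not the FAIM image size) that zero-pads to the next power-of-two square before calling NumPy's 2D-FFT; on ordinary hardware this typically completes in well under a second:

```python
import time
import numpy as np

def fft2_padded(image: np.ndarray) -> np.ndarray:
    """2D-FFT after zero-padding to the next power-of-two square.

    Padding to 2^N x 2^N keeps the radix-2 FFT in its fastest regime.
    """
    n = 1 << int(np.ceil(np.log2(max(image.shape))))  # next power of two
    padded = np.zeros((n, n), dtype=image.dtype)
    padded[: image.shape[0], : image.shape[1]] = image
    return np.fft.fft2(padded)

# Example: a synthetic 1500 x 1200 image, padded to 2048 x 2048
img = np.random.default_rng(0).standard_normal((1500, 1200))
t0 = time.perf_counter()
spectrum = fft2_padded(img)
print(f"2D-FFT with padding took {time.perf_counter() - t0:.3f} s")
```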
L 517-522
The authors refer to Jacobi et al. (2015) and speculate that the meridional wind is strong in April/May and the zonal wind in August, which could explain the large probability of dynamical instability. I do not understand this logic. Why does the largest wind at around the OH altitude indicate a high probability of dynamical instability, without a measurement of the wind shear?
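For context, a note of my own (not from the manuscript): dynamical instability is usually diagnosed with the Richardson number, which depends on the vertical wind shear rather than on the wind speed itself:

```latex
% Richardson number; N = Brunt-Vaisala frequency, (u, v) = horizontal wind
Ri = \frac{N^2}{\left(\partial u/\partial z\right)^2
            + \left(\partial v/\partial z\right)^2},
\qquad
Ri < \tfrac{1}{4} \;\Rightarrow\; \text{dynamical instability possible}
```

A strong wind at a single altitude does not by itself imply a large ∂u/∂z there.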
L 576
‘we observe in our data changes from south in spring to east in winter’.
I cannot see from Figure 15 that the direction is south in spring. Please check this.
Figure 15.
Please indicate the location of the centre of the plot, as well as the W-E and N-S lines.
Is the ‘zero’ value shifted from the centre? That is my guess from the scale axis.
If so, what is the reason?
Citation: https://doi.org/10.5194/egusphere-2025-4611-RC2
This manuscript presents a method for detecting small-scale airglow wave structures using a modified YOLOv7. The paper is well written, scientifically sound, and a welcome contribution. A few concerns regarding the methodology need to be addressed, however. Below are my itemized comments.
Pretraining using BYOL may not be particularly beneficial for this application, and the manuscript provides no evidence that BYOL improves performance. Nevertheless, the authors should at least include further details on the implementation of BYOL and YOLOv7. In particular:
Have the authors tried using YOLOv7 in its original form and compared the numbers?
This is not an appropriate way to report regression performance. Regression tasks should be evaluated using continuous error metrics such as MSE or RMSE, and wavelength and orientation should be reported separately with their respective error distributions. Using a binary threshold to count predictions as “correct” obscures the actual performance and does not provide enough information to assess model accuracy.
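As a concrete sketch of what I mean (hypothetical numbers; the `angular_error_deg` helper and the 180° wrap are my own choices, reflecting the single-image orientation ambiguity discussed above):

```python
import numpy as np

def angular_error_deg(pred_deg, true_deg, period=180.0):
    """Smallest signed orientation difference, wrapped to [-period/2, period/2).

    Phase-front orientation from a single image is only defined modulo 180
    degrees, so a prediction of 178 deg against a truth of 3 deg is a 5 deg
    error, not 175 deg.
    """
    return (np.asarray(pred_deg) - np.asarray(true_deg)
            + period / 2) % period - period / 2

def rmse(err):
    """Root-mean-square error of a residual array."""
    return float(np.sqrt(np.mean(np.square(err))))

# Hypothetical predictions vs. ground truth (illustrative values only);
# the same two numbers can be reported for YOLOv7 and for the 2D-FFT.
wl_pred, wl_true = np.array([18.2, 12.5, 9.8]), np.array([17.0, 13.1, 9.5])   # km
or_pred, or_true = np.array([178.0, 95.0, 41.0]), np.array([3.0, 90.0, 45.0])  # deg

print("wavelength RMSE [km]: ", rmse(wl_pred - wl_true))
print("orientation RMSE [deg]:", rmse(angular_error_deg(or_pred, or_true)))
```

Reporting these two numbers, together with the corresponding error histograms, for both methods would also address my comment on the benchmark below.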
The reported performance is subpar for a task that should not be particularly difficult for a modern neural network. This suggests there might be issues with the data, the network configuration, and/or the training. I suggest that the authors retrain the network without the additional regression features, expand the training data if possible, and include a validation set. If the size of the training dataset is the main constraint, using the test set as the validation set and reporting the validation metrics is also acceptable.
The orientation and wavelength can be handled much more effectively by a dedicated CNN or ViT that processes the image content within the bounding box. Or even better, a DETR-based model would be more suitable for predicting both the bounding box and the orientation. However, adapting the method to DETR would require substantial additional work and is not strictly necessary here.
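To make the suggestion concrete, a minimal sketch of such a crop-based regression head (the architecture and the sin/cos angle encoding are my own illustrative choices, not a reference implementation):

```python
import torch
import torch.nn as nn

class CropRegressor(nn.Module):
    """Tiny CNN regressing wavelength and orientation from a bounding-box crop.

    Orientation is encoded as (sin 2*theta, cos 2*theta) so that the
    180-degree periodicity of phase fronts causes no discontinuity in the
    regression target.
    """

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 3)  # wavelength, sin(2*theta), cos(2*theta)

    def forward(self, crop):
        out = self.head(self.features(crop))
        wavelength = out[:, 0]
        theta = 0.5 * torch.atan2(out[:, 1], out[:, 2])  # radians, (-pi/2, pi/2]
        return wavelength, theta

# Usage: bounding-box crops are resized to a fixed size (here 64 x 64) first.
model = CropRegressor()
wavelength, theta = model(torch.randn(8, 1, 64, 64))
```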
This is not a fair comparison. The 2D-FFT results are evaluated using an error threshold of 2.5° for orientation and 3 percent for wavelength, while the YOLOv7 results are evaluated using a much looser threshold of 10° and 10 percent. Because the criteria differ by a large factor, the “78 percent correct” numbers for the two methods cannot be directly compared.
While I understand the authors are trying to show that 2D-FFT performs better on normal images, the comparison is still problematic. It would be better to compare both methods under the same benchmark; MSE or RMSE are the standard metrics for regression tasks like these.