the Creative Commons Attribution 4.0 License.
Ice crystal images from optical array probes. Compatibility of morphology specific size distributions, retrieved with specific and global Convolutional Neural Networks for HVPS, PIP, CIP, and 2DS
Abstract. The convolutional network methodology is applied to train classification tools for hydrometeor images from optical array probes. Two models were developed in a previous article for the PIP and 2DS and are further tested. Three additional models are presented: for the CIP, HVPS, and a global model trained on a data set that includes all available data from all four instruments. A methodology to retrieve morphology-specific size distributions from the OAP data is provided. Size distributions for each morphological class, obtained with the specific or global classification models, are compared for the ICE GENESIS data set, where all four probes were used simultaneously. The reliability and coherence of these newly obtained machine learning classification tools are demonstrated clearly. The analysis shows significant advantages of using the global model over the specific ones, in terms of compatibility of the size distributions. The obtained morphology-specific size distributions effectively reduce OAP data to a level of detail pertinent to systematically identify microphysical processes. This study emphasizes the potential to improve insights in ice and mixed-phase microphysics based on hydrometeor morphological classification from machine learning algorithms.
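As a rough illustration of what a morphology-specific size distribution is, the sketch below bins classified particles by size and normalizes the counts per class. It is not the authors' retrieval: the function name, the class labels, the bin edges, and in particular the single constant sample volume are assumptions made for brevity (for real OAPs the sample volume generally depends on particle size and airspeed).

```python
from collections import defaultdict


def class_size_distributions(particles, bin_edges_um, sample_volume_l):
    """Return {class_label: [concentration per bin]} in L^-1 um^-1.

    particles       : iterable of (size_um, class_label) pairs
    bin_edges_um    : increasing bin edges in micrometres
    sample_volume_l : sampled air volume in litres (assumed constant here)
    """
    n_bins = len(bin_edges_um) - 1
    counts = defaultdict(lambda: [0] * n_bins)
    for size, label in particles:
        # Find the bin [lower, upper) that contains this particle, if any.
        for i in range(n_bins):
            if bin_edges_um[i] <= size < bin_edges_um[i + 1]:
                counts[label][i] += 1
                break
    # Normalize each count by sample volume and bin width.
    return {
        label: [
            c / (sample_volume_l * (bin_edges_um[i + 1] - bin_edges_um[i]))
            for i, c in enumerate(per_bin)
        ]
        for label, per_bin in counts.items()
    }


# Hypothetical input: three particles already classified by a model.
particles = [(150.0, "column"), (180.0, "column"), (450.0, "aggregate")]
psd = class_size_distributions(particles, [100.0, 200.0, 400.0, 800.0], 2.0)
```

Summing the per-class distributions bin by bin gives the distribution of all classified particles within the binned size range, which is a convenient consistency check.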
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-1910', Anonymous Referee #1, 08 Oct 2024
- suggestion: add classification results for the 3 general types (pristine crystals, intermediary particles and ultimate precipitating particles)
- section 2: a visualization of the CNN architecture would be helpful
- section 2.2.1 / 2.2.2 / 2.2.3: How many training samples are used for each class?
- Figure 2 / 3 / 4: As I understand, the results are for the training data. Please add the results for the test dataset.
- Line 203: It is not clear how the data assimilation was performed. Are the WD test samples added to the training data? Are the same WD samples used for the results presented in figure 5b?
- Line 236: Clarify LaMP
Citation: https://doi.org/10.5194/egusphere-2024-1910-RC1
AC1: 'Reply on RC1', Louis Jaffeux, 17 Oct 2024
- suggestion: add classification results for the 3 general types (pristine crystals, intermediary particles and ultimate precipitating particles)
The classes gathered under the 3 general types may have widely different shapes, meaning that they are subject to different issues with respect to the recognition of their shapes. Adding these metrics is therefore not pertinent to users of the trained models, either for understanding the obtained results or for assessing their performance.
- section 2: a visualization of the CNN architecture would be helpful
The article of Jaffeux et al. (2022) contains several of these representations. One of these figures will be added in a revised version of the manuscript.
- section 2.2.1 / 2.2.2 / 2.2.3: How many training samples are used for each class?
The exact number of images for each class and each instrument is not systematically listed in the article; these numbers appear only where necessary. A table can be added with the same layout as Figure 1, but giving these numbers instead of image examples. Readers have free access to the full hand-labeled image folders in the dedicated GitHub repository.
- Figure 2 / 3 / 4: As I understand, the results are for the training data. Please add the results for the test dataset.
These figures provide the results of model predictions using an independent test data set, which was separated from the hand-labeled data set before training as described in Section 2. The overall accuracy obtained on the validation data sets is mentioned in each paragraph.
- Line 203: It is not clear how the data assimilation was performed. Are the WD test samples added to the training data? Are the same WD samples used for the results presented in figure 5b?
WD test samples were obtained as a result of specific testing, which is a qualitative check beyond the presented methodology performed when using the models on newly acquired data sets. They were added to the whole hand-labeled data set before using the same methodology. In particular, a training set was separated from the data set, used to train the model, and finally evaluated with the independent test data.
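The procedure described in this reply (add the newly labeled samples to the hand-labeled pool, set aside an independent test set, train on the remainder) can be sketched as follows. This is a minimal illustration, not the code from the repository: the function name, the 20 % test fraction, the per-class stratification, and the class labels other than WD are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), holding out a share of every class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Hold out at least one sample per class for testing.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical pool: existing hand-labeled samples plus newly labeled WD samples.
existing_labels = ["column"] * 10 + ["aggregate"] * 10
new_labels = ["WD"] * 5
pool = existing_labels + new_labels

train_idx, test_idx = stratified_split(pool)
```

Because the split is made after the new samples are merged into the pool, the held-out indices are never seen during training, and every class, including the newly added one, is represented in the test set.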
- Line 236: Clarify LaMP.
Will be clarified as Laboratoire de Météorologie Physique (LaMP).
Citation: https://doi.org/10.5194/egusphere-2024-1910-AC1
RC2: 'Comment on egusphere-2024-1910', Anonymous Referee #2, 17 Oct 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-1910/egusphere-2024-1910-RC2-supplement.pdf
AC2: 'Reply on RC2', Louis Jaffeux, 20 Nov 2024
Major Comment #1:
We have answered this comment in a previous response as follows:
“Details concerning the CNN methodology have been added in two additional paragraphs in Section 2. The results are easily reproducible thanks to open access to data sets and trained CNN models in the linked GitHub repository. Our recent paper (Jaffeux et al. 2022) is primarily presenting the methodology of the implemented CNN.”
We have added a condensed description of the CNN used (two paragraphs, about half a page) to the revised manuscript, so we are surprised that this major comment still stands. The image data selection, training, and testing methodologies are important for the present study; however, they follow point by point the methodology detailed in a dedicated, already published article (Jaffeux et al. 2022). This is why we do not consider it necessary to copy details of Jaffeux et al. (2022) into this study. Additional CNN information has nonetheless been added in the form of a schematic, as stated in the reply to RC1 (https://doi.org/10.5194/egusphere-2024-1910-AC1). This part of the article is easily reproducible thanks to the open-access GitHub repository, which contains all images used for the CNN models.
The study presented here goes far beyond the training of CNNs (which was the objective of Jaffeux et al. 2022). It presents unprecedented scientific results by applying the CNN models to the OAP data sets of the ICE GENESIS measurement campaign in order to produce morphology-dependent quantitative number size distributions (see the entire Section 3 of this study). These size distributions are produced for four OAP probes, with the possibility of building composite morphology-dependent distributions from overlapping OAP probes. This has never been done before. The preceding sentences also respond to the reviewer's second major comment, concerning the novelty of the study. The presented work cannot be done with CPI image data: the CCD-camera-based instrument has such a small sample volume that size distributions cannot be retrieved in the heterogeneous clouds encountered by aircraft. We are of course aware that CPI images have better resolution and, in addition, 256 grey levels, which makes CNNs for the high-resolution CPI imager different (and certainly extremely valuable) compared with CNNs for the poorly resolved black-and-white OAP images. CNN models for CPI probes produce different results than CNN models for OAP probes; however, the quantitative results presented in Section 3 of this study cannot be produced with CPI image data.
Major Comment #2
Likewise, we have responded to this major comment as follows:
“The introduction has been improved in order to highlight the differences with previous methods that used OAP data. Note that obtaining morphology specific particle size distributions from OAPs (or any other instrument for that matter) was never shown in any other research publications. Przybylo et al. (2022) used CPI images, which cannot be used for quantitative concentration measurements (neither concentrations nor size distributions if not averaged over long time periods which is not compatible with cloud spatial heterogeneity encountered with aircraft); in particular, they do not produce particle size spectra (compare the 3 million usable images they obtained from eleven field campaigns to the 2.3 million used here and obtained in 7 flight hours; see next point).”
The introduction mentions the few studies that have successfully applied CNNs to OAP images (Jaffeux et al. (2022) and Zhang et al. (2023)). Since OAP images have very limited pixel numbers and resolution, the CNN method may be more challenging for them: their information content is not as rich as that of CPI images (Przybylo et al. (2022)), and it is hardly comparable with that of diffraction patterns (Schmidt et al. (2024a,b)). Nevertheless, we will mention these studies in the introduction, since they used a very similar methodology for the shape retrievals. With respect to novelty, the methodology was used here for the first time to successfully train a single model on image data sets from four different probe models with different image resolutions and pixel numbers, whereas Jaffeux et al. (2022) and Zhang et al. (2023) limited their methodology to single-probe data sets. Section 3 goes beyond existing work on OAP-based morphological analysis, as it takes advantage of the morphological information extracted from a coherent data set acquired with four different probes to reconstruct quantitative morphology-specific size distributions. These are presented for the first time and provide new insights into shape recognition with respect to observable size range and pixel resolution. Moreover, they are of great interest for the understanding of ice microphysics. The introduction will be corrected to reflect the importance of these latter points and to add further selected relevant references.
Minor Comment #1
Acronyms are now defined at first use, including in the abstract. In particular, probe and institute names will be written in full the first time they are used.
Minor Comment #2
Grammar has been checked and corrected where needed.
Citation: https://doi.org/10.5194/egusphere-2024-1910-AC2
Data sets
Public GitHub repository with data sets, codes, and trained CNN models Louis Jaffeux https://github.com/LJaffeux/JAFFEUX_et_al_AMT_2024